Knowledge-Driven Machine Learning
Jayaraman J. Thiagarajan | 21-ERD-012
Project Overview
Incorporating domain knowledge into predictive modeling is a natural way to build reliable models from limited data in scientific problems. Most often, this involves designing appropriate (and differentiable) constraints or losses and customized architectures. While this has led to a broad class of domain-aware machine learning solutions, continued progress requires statistical models that can leverage domain knowledge comprising heterogeneous entities and complex relationships. In practical applications, knowledge representations come in different modalities and complexities (e.g., knowledge graphs, non-differentiable simulators, families of data generation processes, and specifications of physical models based on parallel discrete event simulations (PDES)), and unified learning frameworks are needed to leverage these knowledge sources effectively. In addition to producing physically grounded models with improved generalization, this new class of solutions can help integrate decision support into machine learning (ML) pipelines. Through this project, we have developed a novel knowledge-driven artificial intelligence (AI) framework that can systematically leverage information from different knowledge encodings.
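As a minimal illustration of what a differentiable constraint loss can look like, the sketch below adds a soft penalty to a least-squares fit. It is not the framework developed in this project. The constraint (the response never decreases with feature 0, so its weight should be non-negative), the function names, and the penalty weight lam are all hypothetical choices made for the example.

```python
import numpy as np

def knowledge_regularized_loss(w, X, y, lam=10.0):
    """Mean squared error plus a differentiable penalty on constraint violation.

    Hypothetical domain knowledge: the response never decreases as feature 0
    grows, so the weight w[0] should be non-negative.
    """
    residual = X @ w - y
    data_loss = np.mean(residual ** 2)
    violation = np.maximum(0.0, -w[0])  # zero whenever the constraint holds
    return data_loss + lam * violation ** 2

def loss_gradient(w, X, y, lam=10.0):
    """Gradient of knowledge_regularized_loss with respect to w."""
    residual = X @ w - y
    grad = 2.0 * (X.T @ residual) / len(y)
    violation = np.maximum(0.0, -w[0])
    grad[0] += -2.0 * lam * violation  # derivative of lam * max(0, -w0)^2
    return grad

def fit(X, y, lam=10.0, lr=0.01, steps=5000):
    """Plain gradient descent on the penalized loss, starting from zero weights."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * loss_gradient(w, X, y, lam)
    return w

if __name__ == "__main__":
    # Tiny synthetic dataset whose labels contradict the assumed constraint.
    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
    y = X @ np.array([-0.5, 1.0])
    print("data only:      ", fit(X, y, lam=0.0))
    print("with constraint:", fit(X, y, lam=10.0))
```

With lam set to zero the fit follows the data alone. With a positive lam, the weight on feature 0 is pulled toward the region the assumed knowledge allows, while the loss stays differentiable everywhere.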
Mission Impact
In general, domain knowledge can play a central role in the design of ML models, the collection and preparation of data, and even the interpretation of results. This project falls directly within the Data Science Core Competency by developing new theories and algorithms useful for a broad range of predictive sciences. Because of its fundamental nature, the results from this project have the potential to impact, to varying degrees, nearly all of the laboratory's five mission focus areas. This project has also led to several high-impact publications at prestigious conferences, including ICML 2022, NeurIPS 2022, ICLR 2023, ICML 2023, WACV 2023, ICASSP 2023, KDD 2022, and WebConf 2022. Whether it is enabling knowledge-graph-driven bio-resilience, guiding experiment design with simulations, supporting non-proliferation, or supporting small molecule design, knowledge-driven ML can open new opportunities for data-efficient and reliable AI modeling. Our solutions are already being adopted by several ongoing efforts at Lawrence Livermore National Laboratory, and we expect our software framework to enable wider adoption of these tools in other mission-critical applications.
Publications, Presentations, Patents
P. Trivedi, D. Koutra, J. J. Thiagarajan, "A Closer Look at Scoring Functions and Generalization Prediction" (Presentation, IEEE ICASSP 2023, Rhodes Island, Greece, June 2023). LLNL-CONF-842368.
K. Thopalli, R. Subramanyam, P. Turaga, J. J. Thiagarajan, "SiSTA: Target-Aware Generative Augmentations for Single-Shot Adaptation" (Presentation, ICML 2023, Honolulu, HI, July 2023). LLNL-CONF-844756.
R. Subramanyam, K. Thopalli, S. Berman, P. Turaga, J. J. Thiagarajan, "Single-Shot Domain Adaptation via Target-Aware Generative Augmentations" (Presentation, IEEE ICASSP 2023, Rhodes Island, Greece, June 2023). LLNL-CONF-842367.
R. Subramanyam, M. Heimann, T. S. Jayram, R. Anirudh, J. J. Thiagarajan, "Contrastive Knowledge-Augmented Meta-Learning for Few-Shot Classification" (Presentation, WACV 2023, Waikoloa, HI, January 2023). LLNL-CONF-833730.
P. Trivedi, D. Koutra, J. J. Thiagarajan, "Exploring the Design of Adaptation Protocols for Improved Generalization and Machine Learning Safety" (Presentation, ICML PODS 2022, Baltimore, MD, July 2022). LLNL-PROC-836992.
R. Subramanyam, V. Narayanaswamy, M. Naufel, A. Spanias, J. J. Thiagarajan, "Improved StyleGAN-v2 based Inversion for Out-of-Distribution Images" (Presentation, ICML 2022, Baltimore, MD, July 2022). LLNL-CONF-829448.
K. Thopalli, P. Turaga, J. J. Thiagarajan, "R3-Labeling Domains Improves Multi-Domain Generalization" (Presentation, IEEE ICASSP 2022, Singapore, May 2022). LLNL-CONF-827848.
P. Trivedi, E. Lubana, M. Heimann, D. Koutra, J. J. Thiagarajan, "Understanding Self-Supervised Graph Representation Learning from a Data-Centric Perspective" (Presentation, NeurIPS 2022, New Orleans, LA, 2022). LLNL-CONF-835530.
J. J. Thiagarajan, "Comparative Code Structure Analysis using Deep Learning for Performance Prediction" (Presentation, 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2021), Virtual, March 2021).
H. Song, B. Kailkhura, J. J. Thiagarajan, "Preventing Failures By Dataset Shift Detection in Safety-Critical Graph Applications." Frontiers in Artificial Intelligence, Volume 4, 2021, LLNL-JRNL-822138.
U. Shanthamallu, J. J. Thiagarajan, A. Spanias, "Uncertainty-Matching Graph Neural Networks to Defend Against Poisoning Attacks" (Presentation, AAAI 2021, Virtual, February 2021). LLNL-CONF-814665.