Zeroth-Order Machine Learning for Flexible Integration of Domain Knowledge
James Diffenderfer | 22-FS-019
Scientific machine learning (SciML) can enhance scientific workflows (SW) (e.g., simulations) or machine learning (ML) by creating synergistic models, built on ML foundations, that use domain knowledge as modules in an ML model. SciML models could be used to accelerate the scientific discovery process; however, training is limited to systems where gradient information is available. In many instances, gradient information is not provided by an SW or is nonexistent, yielding untrainable SciML models. In this project, we explored the use of deep neural network (DNN) sparsity and zeroth-order optimization (ZOO) to answer the following feasibility question: Can we leverage gradient-free techniques to train accurate SciML models? Specifically, we discovered that by using ZOO techniques to train DNN correction functions, we could enhance the quality of low-fidelity fluid-dynamics simulations (PhiFlow) beyond the baseline technique for training DNNs for the same task. Notably, our ZOO-trained SciML model produced a 17x reduction in mean absolute error (MAE) on the test set relative to the model trained using the baseline technique. This successful demonstration of feasibility is an important first step toward the broader use of ZOO for training SciML models. Additionally, for the first time, we demonstrated that DNN sparsity can be applied effectively in our SciML application, which not only reduces the cost of ZOO training but also reduces the memory overhead when enhancing simulation codes via deep learning.
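The report does not include the training code, but the core primitive behind ZOO is a randomized finite-difference gradient estimate built only from function evaluations. As a minimal illustrative sketch (not the project's actual implementation), the two-point estimator below is applied to a hypothetical black-box loss standing in for a simulation-based objective that exposes no gradients:

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_samples=20, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Averages d * (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u over random
    unit directions u, so only function evaluations are required --
    no backpropagation through the simulation.
    """
    rng = rng or np.random.default_rng(0)
    d = x.size
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        g += d * (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples

def black_box_loss(w):
    # Hypothetical stand-in for a gradient-free simulation loss,
    # e.g. MAE between corrected low-fidelity and reference output.
    target = np.array([1.0, -2.0, 0.5])
    return float(np.mean(np.abs(w - target)))

# Plain zeroth-order "gradient" descent on the black-box loss.
w = np.zeros(3)
for step in range(200):
    w -= 0.05 * zo_gradient(black_box_loss, w)
```

In the project's setting, `w` would be the (sparse) parameters of a DNN correction function and `black_box_loss` would wrap a PhiFlow simulation; sparsity reduces the dimension `d`, which directly lowers the number of function evaluations ZOO needs.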
Due to the fundamental nature of this work and the widespread impact of ML, results from this project have the potential to benefit all four of LLNL's mission focus areas. The techniques developed and demonstrated in this project are paving the way for fundamental advances in Data Science and the settings in which it can be applied. While the scope of our application was limited (due to the nature of a feasibility study), our methods can provide new insights into many relevant SciML applications that currently face design constraints because their scientific workflows do not provide gradient information. The feasibility demonstrated in this project and the tools developed herein can serve as the groundwork for a comprehensive ZOO training framework with the potential to impact several mission-critical applications, including projects in Advanced Materials and Non-Destructive Evaluation (NDE) facing this same issue. If developed, such a comprehensive ZOO training framework can directly benefit the Cognitive Simulation (CogSim) Director's Initiative and WCI's long-term Virtual Design Assistant (ViDA) strategy. This project also established collaborative relationships between LLNL and two active deep-learning research groups: Dr. Sijia Liu's group at Michigan State University and Dr. Yi Zhou's group at the University of Utah. These collaborations will help to strengthen LLNL's position as a leader in deep-learning research and development.