Andrew Gillette | 20-FS-034
The usefulness of a trained neural network for scientific applications hinges on whether the network will give accurate outputs on previously unseen inputs. When new inputs lie "inside" the training data set, the assignment of an output is a problem of interpolation, raising the question that was the focus of this feasibility study: Is it possible to measure how well a given trained neural network performs at interpolation tasks? More precisely, can a metric be devised that gives larger values when a trained neural network is "farther" from a standard interpolation procedure and smaller values when it is "closer" to it? The research conducted during this project demonstrated that the answer to this question is yes. A quantity called the Delaunay loss was defined rigorously and shown to provide the desired type of assessment. To compute this quantity, a research code was developed and tested on data sets with up to 50 dimensions and tens of thousands of points, including down-sampled versions of the publicly available Modified National Institute of Standards and Technology (MNIST) data set. The Delaunay loss is an objective benchmark measurement, agnostic to application context, and could be implemented efficiently in higher dimensions with additional effort. This new tool offers great potential to aid in the broad mission of integrating machine learning and simulation workflows for predictive analyses.
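The core idea of comparing a trained model to a standard interpolation procedure can be illustrated with a small sketch. The code below is a simplified, hypothetical version of such a comparison, not the project's actual definition or implementation of the Delaunay loss: it measures the mean absolute gap between a model's predictions and the piecewise-linear interpolant built on a Delaunay triangulation of the training data (SciPy's LinearNDInterpolator constructs this triangulation internally). The function name `delaunay_loss_sketch` and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

def delaunay_loss_sketch(train_x, train_y, test_x, predict):
    """Mean absolute gap between a model's predictions and piecewise-linear
    (Delaunay) interpolation of the training data, at test points inside
    the convex hull of the training set. Illustrative only."""
    # LinearNDInterpolator builds a Delaunay triangulation of train_x
    interp = LinearNDInterpolator(train_x, train_y)
    ref = interp(test_x)
    inside = ~np.isnan(ref)  # NaN flags points outside the hull (extrapolation)
    pred = predict(test_x[inside])
    return float(np.mean(np.abs(pred - ref[inside])))

# Toy example: 2D inputs, scalar outputs
rng = np.random.default_rng(0)
train_x = rng.uniform(-1, 1, size=(200, 2))
f = lambda x: x[:, 0] ** 2 + x[:, 1]
train_y = f(train_x)
test_x = rng.uniform(-0.8, 0.8, size=(50, 2))

# A "model" identical to the interpolant scores (essentially) zero...
interp = LinearNDInterpolator(train_x, train_y)
loss_same = delaunay_loss_sketch(train_x, train_y, test_x, interp)
# ...while a model far from interpolating behavior scores a larger value
loss_const = delaunay_loss_sketch(train_x, train_y, test_x,
                                  lambda x: np.zeros(len(x)))
```

In practice the `predict` argument would be a trained network's forward pass; the behavior matches the metric's intent, with smaller values for models closer to standard interpolation.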
Our project addresses a statement issued by the DOE regarding the current state of scientific machine learning and artificial intelligence (AI): "Educated trial and error continues to guide advances in science applications of AI. ... Having a better understanding of the relationships between data and models would have an enormous impact in connecting research on restricted data with advances in open science." In addition, this project supports Lawrence Livermore National Laboratory's core competency in high-performance computing, simulation, and data science. The new measurement tool of the Delaunay loss, developed by this feasibility study, offers a means to escape the trial-and-error loop by connecting existing machine-learning heuristics to long-established mathematical results in high-dimensional function approximation. Potential, but as yet unexplored, future directions include a geometric criterion for identifying adversarial examples in neural networks, enabling interpretability of deep-learning models via comparison to piecewise-linear interpolation, and aiding in the transfer of networks across experimental design setups.
Publications, Presentations, and Patents
Gillette, A. and T. Chang, 2021. "Assessing Latent Space Dimension by Delaunay Loss." Winter Conference on Applications of Computer Vision 2021 (virtual), January 2021. LLNL-CONF-814930
Gillette, A., 2020. "How I Learned to Stop Worrying and Love Machine Learning Research." Lawrence Livermore National Laboratory 2020 Computing Virtual Seminar Series. LLNL-PRES-812692