Multivariate Gaussian Process Surrogate Models in High Dimensions

Robert Armstrong | 21-FS-037

Project Overview

There is a need for improved predictive models that implicitly account for uncertainty in computationally demanding problems. In this project, we explored the feasibility of using Gaussian processes (GPs) to build a surrogate model for the inverse problem of inferring multiple correlated variables in high dimensions. We extended the fast approximate GP code MuyGPyS to incorporate a multivariate model and tested it on cosmological simulation data. For our validation dataset, we found excellent agreement between the GP results and Markov chain Monte Carlo for both the 1D posterior constraints and the coverage. However, we found that the hyperparameter optimization strategy used in MuyGPyS was not always sensitive to some parameters, which therefore needed their own estimation approach. Further work is needed to understand how to better constrain the correlation parameters. Achieving good results also required a training set of more than a few thousand samples. Because of this requirement, we were unable to run the tests we had planned on large cosmological N-body simulations, but future work could overcome this limitation. Our work demonstrates the potential gain from using GP models for parameter inference in the inverse problem compared to other techniques.
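
To make the notion of coverage concrete, the following is a minimal sketch of how the empirical coverage of a GP's predictive intervals might be checked. It is not the MuyGPyS implementation and not the multivariate model developed in this project: it assumes an exact, single-output GP with a squared-exponential kernel, fixed hyperparameters, and a synthetic sine-wave dataset, and every function name in it (rbf_kernel, gp_posterior, empirical_coverage) is hypothetical and introduced only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise_var,
                 length_scale=1.0, variance=1.0):
    """Exact GP predictive mean and variance for noisy observations.

    Hyperparameters are held fixed here (an assumption of this sketch);
    no hyperparameter optimization is performed.
    """
    k_train = rbf_kernel(x_train, x_train, length_scale, variance)
    k_train += noise_var * np.eye(len(x_train))
    k_cross = rbf_kernel(x_train, x_test, length_scale, variance)

    # Solve through a Cholesky factor instead of forming an explicit inverse.
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross.T @ alpha

    v = np.linalg.solve(chol, k_cross)
    # Latent-function variance plus observation noise.
    var = variance - np.sum(v**2, axis=0) + noise_var
    return mean, var


def empirical_coverage(y_true, mean, var, z=1.96):
    """Fraction of held-out targets inside the central z-sigma interval."""
    half_width = z * np.sqrt(var)
    return float(np.mean(np.abs(y_true - mean) <= half_width))


# Synthetic data (assumption): a noisy sine wave stands in for simulation output.
rng = np.random.default_rng(0)
noise_std = 0.1
x_train = rng.uniform(0.0, 10.0, size=200)
y_train = np.sin(x_train) + noise_std * rng.normal(size=200)
x_test = rng.uniform(0.0, 10.0, size=500)
y_test = np.sin(x_test) + noise_std * rng.normal(size=500)

mean, var = gp_posterior(x_train, y_train, x_test, noise_var=noise_std**2)
coverage = empirical_coverage(y_test, mean, var)
print(f"Empirical coverage of the nominal 95% interval: {coverage:.3f}")
```

If the predictive uncertainties are statistically consistent, the printed fraction should lie close to the nominal 0.95. A value well below that would indicate overconfident intervals, and a value well above it would indicate overly wide ones.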

Mission Impact

This work directly impacts our ability to build predictive models for complex data-driven problems that require accurate uncertainty quantification. Our GP method, in contrast to many other machine learning methods, produces accurate and statistically consistent uncertainty predictions. Because many applications rely on simulation-based models, this approach could have a broad impact across many areas at Lawrence Livermore National Laboratory. The work highlights the power of GP-based models as an alternative to other modern methods and is a particularly promising new line of research for future problems requiring accurate statistical inference.