Computational Framework for Data Assimilation and Uncertainty Management of Large-Dimensional Dynamics Models

Xiao Chen (16-ERD-023)

Executive Summary

We are developing a high-performance computing capability to perform stochastic source inversions of unpredictable systems with random variables, for applications such as seismic inversion, carbon dioxide sequestration, and power-grid management. This tool could precisely determine the degree of confidence scientists can have in simulation-based or data-driven predictions for a variety of applications across DOE missions.

Project Description

Many important, real-life applications require source inversion, that is, recovering the true characterization of an unknown source field from noisy measurement data and computer simulations of the underlying physical processes. For example, accurate characterization of the Earth's structure using seismic data is important for reliable nuclear explosion monitoring and ground motion hazard assessment. In power-grid management, accurate representation of the state of the coupled distribution and transmission network (the source) is important for enhancing control strategies for fast-demand response and proactive vulnerability prevention. However, several challenges arise in a typical inversion process. First, the inversion may require tens of thousands of expensive model simulations. Second, the complex models may not accurately represent the underlying physical processes. Third, the measurement data may be sparse and polluted with random noise. Existing inversion algorithms suffer from high computational cost and rely on the assumption of a Gaussian probability distribution, both of which undermine the credibility of decisions based on their results. We are developing a high-performance-computing capability for large-scale source inversion of high-dimensional stochastic models (i.e., unpredictable systems with random variables). Our scope includes high-dimensional data analysis, stochastic data assimilation, and nonlinear optimization. We will design efficient and robust algorithms for general and complex inversion and develop an integrated software framework to manage algorithmic complexity and model uncertainties. We will demonstrate this capability on selected, relevant test problems, such as seismic inversion, carbon dioxide sequestration, and power-grid management. Our capability will enable scientists to make precise statements about the degree of confidence they have in their simulation-based and data-driven predictions.

Our mathematical framework and the associated software platform will enable several important Laboratory applications to address their stochastic source-inversion problems and manage uncertainties more efficiently and robustly. With this platform, practitioners and researchers will have a tool to identify, from measurement data, potential risks in a complex stochastic simulation that cannot presently be fully evaluated. The project will also expand the envelope of uncertainty quantification and stochastic data-assimilation methods to handle more general and complex scenarios. The software package will be made available to the wider scientific community for use and for further collaborative development. Our research effort will impact Livermore's high-performance computing and simulation by enhancing predictive simulations and providing a use case for exploring high-performance computer designs.

Mission Relevance

Our project supports the NNSA goal of expanding and applying our science and technology capabilities to deal with broader national security challenges. It also aligns with Livermore's core competencies in high-performance computing, simulation, and data science and in Earth and atmospheric science.

FY17 Accomplishments and Results

In FY17, we (1) incorporated a derivative-based optimal control algorithm into the Data Assimilation for Stochastic Source Inversion software; (2) enhanced the kernel principal component analysis module with pre-imaging using a non-iterative algorithm; (3) added a weighted kernel scheme and a diffusion map capability to the module; (4) added random walk (gradient-free) Markov chain Monte Carlo and Langevin (gradient-enhanced) Markov chain Monte Carlo capabilities to the inversion software; (5) used the inversion software for stochastic inversion of subsurface elasticity parameters; (6) developed a novel interpolation method to predict subsurface structural properties using well-bore logs; and (7) designed a synthetic two-dimensional case to test the software on stochastic inversion for seismic structural models.
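To illustrate the difference between the two sampler families named in item (4), the following is a minimal sketch of a random walk Metropolis sampler and a Metropolis-adjusted Langevin sampler. It is not the project's software: the one-dimensional Gaussian target (mean 2.0, standard deviation 0.5), the function names, and the step sizes are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy posterior: 1-D Gaussian with mean 2.0 and std 0.5.
MEAN, STD = 2.0, 0.5


def log_post(x):
    """Log-density of the toy posterior, up to an additive constant."""
    return -0.5 * ((x - MEAN) / STD) ** 2


def grad_log_post(x):
    """Derivative of log_post with respect to x."""
    return -(x - MEAN) / STD ** 2


def random_walk_mh(n_steps, step, x0=0.0, seed=0):
    """Gradient-free sampler: symmetric Gaussian proposal around x."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        prop = x + step * rng.gauss(0.0, 1.0)
        # Symmetric proposal, so only the posterior ratio matters.
        if math.log(1.0 - rng.random()) < log_post(prop) - log_post(x):
            x = prop
        samples.append(x)
    return samples


def mala(n_steps, step, x0=0.0, seed=0):
    """Gradient-enhanced sampler: Langevin drift plus Metropolis correction."""
    rng = random.Random(seed)

    def log_q(to, frm):
        # Log-density (up to a constant) of proposing `to` from `frm`.
        mu = frm + 0.5 * step ** 2 * grad_log_post(frm)
        return -((to - mu) ** 2) / (2.0 * step ** 2)

    x, samples = x0, []
    for _ in range(n_steps):
        drift = 0.5 * step ** 2 * grad_log_post(x)
        prop = x + drift + step * rng.gauss(0.0, 1.0)
        # Asymmetric proposal, so the Hastings correction is required.
        log_alpha = (log_post(prop) - log_post(x)
                     + log_q(x, prop) - log_q(prop, x))
        if math.log(1.0 - rng.random()) < log_alpha:
            x = prop
        samples.append(x)
    return samples


if __name__ == "__main__":
    for name, sampler in (("random walk", random_walk_mh), ("Langevin", mala)):
        chain = sampler(20000, 0.5)[2000:]  # discard burn-in
        print(name, "posterior mean estimate:", sum(chain) / len(chain))
```

In this sketch the Langevin proposal uses the gradient of the log-posterior to drift toward high-probability regions, which is the general reason gradient-enhanced samplers tend to mix faster than random walk samplers in high dimensions. A real inversion would replace the toy density with a posterior built from a forward model and measurement data.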


Figure 1.
(a) The unknown channelized geological elasticity random field to recover; lambda, in units of MPa, denotes the elastic property of the underlying material. (b) The low-dimensional channelized elasticity random field generated by a nonlinear dimension-reduction approach. (c) The posterior mean values estimated by gradient-enhanced, reduced-order stochastic inversion. (d) The posterior standard deviation values estimated by the reduced-order stochastic inversion. The convergence rate is improved significantly through kernel principal component analysis (PCA)-based machine learning, coupled with gradient-based stochastic inversion.

Publications and Presentations

Thimmisetty, C. A., et al., 2017. "High Dimensional Intrinsic Interpolation Using Gaussian Process Regression." Mathematical Geosciences. 1-20. LLNL-JRNL-728760.
