Functional neural network surrogates trained on scarce data

Abstract/Contents

Abstract
Neural networks (NNs) are often used as surrogates or emulators of partial differential equations (PDEs) that describe the dynamics of complex systems. A virtually negligible computational cost of such surrogates renders them an attractive tool for ensemble-based computation, which requires a large number of repeated PDE solves. Since the latter are also needed to generate sufficient data for NN training, the usefulness of NN-based surrogates hinges on the balance between the training cost and the computational gain stemming from their deployment. To reduce this cost, we propose to use multifidelity simulations in order to increase the amount of data one can generate during the allocated computing time. High-, and low-fidelity images are generated by solving PDEs on fine and coarse meshes, respectively; these multifidelity data are then patched together in the process of training a deep convolutional NN (CNN) using transfer learning. This strategy is further generalized to incorporate three levels of multifidelity data. We use theoretical results for multilevel Monte Carlo to guide our choice of the numbers of simulations (and resultant images) of each kind. This multifidelity training is used to estimate the distribution of a quantity of interest, whose dynamics is governed by a system of nonlinear PDEs with uncertain/random coefficients (e.g., parabolic PDEs governing the dynamics of multiphase flow in heterogeneous porous media). Our numerical experiments demonstrate that a mixture of a comparatively large amount of low-fidelity data and a much smaller amount of high-fidelity data provides an optimal balance between computational speedup and prediction accuracy. When used in the context of uncertainty quantification, our multifidelity strategy is several orders of magnitude faster than either CNN training on high-fidelity images only or Monte Carlo solution of the PDEs. 
This computational speedup is achieved while preserving the accuracy of the estimators of the distributions of the quantities of interest, as expressed in terms of both the Wasserstein distance and the Kullback--Leibler divergence. To further reduce the cost of data generation, we demonstrate that one can start the CNN training for a new task (a given set of PDEs describing, e.g., reactive transport) from a CNN that was originally trained for a different task (another set of PDEs, e.g., for multiphase flow). Our numerical experiments show that this transfer learning approach yields a considerable speedup when applied to the problem of the estimation of the distribution of a quantity of interest, whose dynamics is prescribed by a system of nonlinear PDEs for advection-dispersion transport in porous media with uncertain/random conductivity field. For a given amount of training data, the method has equal or greater prediction accuracy and generalizability to unseen inputs than a CNN whose training is initialized randomly.
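The multilevel Monte Carlo (MLMC) guidance mentioned in the abstract can be illustrated by the standard MLMC allocation rule, in which the number of samples per fidelity level is proportional to the square root of the variance-to-cost ratio at that level. The sketch below is a minimal, generic illustration of that rule; the variances, costs, and budget are made-up values, not results from the thesis.

```python
import math

def mlmc_sample_counts(variances, costs, budget):
    """Standard MLMC allocation: n_l proportional to sqrt(V_l / C_l),
    scaled so the total cost sum(n_l * C_l) fits within the budget."""
    weights = [math.sqrt(v / c) for v, c in zip(variances, costs)]
    scale = budget / sum(w * c for w, c in zip(weights, costs))
    return [max(1, round(scale * w)) for w in weights]

# Illustrative three-level setup: coarse meshes are cheap and noisy,
# fine meshes are costly but accurate (values are hypothetical).
counts = mlmc_sample_counts(variances=[1.0, 0.1, 0.01],
                            costs=[1.0, 10.0, 100.0],
                            budget=1000.0)
# counts = [333, 33, 3]: many coarse solves, few fine ones.
```

The resulting allocation mirrors the thesis's empirical finding that a large amount of low-fidelity data combined with a small amount of high-fidelity data balances cost and accuracy.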
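The two accuracy metrics named in the abstract, the Wasserstein distance and the Kullback--Leibler divergence, can be computed for one-dimensional samples and histograms as follows. This is a generic, standard-library sketch, not code from the thesis; the sample values are placeholders.

```python
import math

def wasserstein_1d(x, y):
    # Empirical 1-Wasserstein distance between two equal-size 1-D samples:
    # the mean absolute difference of the sorted values.
    xs, ys = sorted(x), sorted(y)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

def kl_divergence(p, q, eps=1e-12):
    # Kullback--Leibler divergence D(p || q) between two discrete
    # distributions given as (unnormalized) histogram counts.
    sp, sq = sum(p), sum(q)
    return sum((pi / sp) * math.log((pi / sp + eps) / (qi / sq + eps))
               for pi, qi in zip(p, q))

# Hypothetical samples of a quantity of interest: one from a surrogate,
# one from a reference Monte Carlo solve.
surrogate = [0.1, 0.4, 0.5, 0.9]
reference = [0.0, 0.5, 0.5, 1.0]
w1 = wasserstein_1d(surrogate, reference)   # 0.075
kl = kl_divergence([3, 5, 2], [3, 5, 2])    # 0.0 for identical histograms
```

Both quantities vanish when the surrogate-predicted distribution matches the reference one, which is how the abstract uses them to certify that the multifidelity speedup does not degrade estimator accuracy.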

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Song, Dong Hee
Degree supervisor Tartakovsky, Daniel
Thesis advisor Tartakovsky, Daniel
Thesis advisor Horne, Roland N.
Thesis advisor Kovscek, Anthony R. (Anthony Robert)
Degree committee member Horne, Roland N.
Degree committee member Kovscek, Anthony R. (Anthony Robert)
Associated with Stanford University, Department of Energy Resources Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Dong H. Song.
Note Submitted to the Department of Energy Resources Engineering.
Thesis Thesis (Ph.D.), Stanford University, 2022.
Location https://purl.stanford.edu/zx469sx0349

Access conditions

Copyright
© 2022 by Dong Hee Song
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
