Integrating data and models for sustainable decision-making in hydrology


Climate change results in both long-term droughts and short-term extreme precipitation, which can significantly affect water quality and quantity. To make smart decisions about water resources under uncertain climates, it is important for scientists to convey accurate predictions of water systems to water resource managers. This requires integrating multiple geophysical, geochemical, and hydrologic datasets to build accurate hydrologic models and provide predictions of water flow and quality. However, the model-data integration process can be hindered by challenges such as complex hydrologic modeling, lack of geologically realistic models, and slow or ineffective model calibration methods. These challenges limit the transfer of model-data integration methods from theory to practice and make it difficult to translate hydrologic models into effective decisions. In this dissertation, we present new methods that address these challenges and provide real-world hydrologic examples of model-data integration. We start by introducing the model-data integration process and associated challenges in Chapter 1. In Chapter 2, we introduce a new geological interface modeling method to integrate multiple datasets and, most importantly, geological knowledge: a data-knowledge-driven trend surface analysis. We define different density functions for different information sources, and sample trend interfaces using the Metropolis-Hastings algorithm with stationary Gaussian field perturbations. This method works for both explicit and implicit interface modeling, where the key advance of the implicit model is to represent complex interfaces and geometries without heavy parameterization. We demonstrate our method on three test cases: modeling stochastic interfaces of Greenland subglacial topography, magmatic intrusion, and palaeovalleys for groundwater mapping in South Australia.
This new trend surface analysis tool is useful for building geological models and hydrostratigraphic layers for hydrologic site characterization. In Chapter 3, we design a hierarchical Bayesian formulation to invert both global and spatial uncertain variables. We propose an inversion method that uses machine learning to calculate summary statistics and that applies to both linear and non-linear forward models. We also introduce a new local principal component analysis (local PCA) approach that provides a more efficient method for local inversion of large-scale spatial fields. In addition, we provide a likelihood-free inverse method based on density estimators, covering both traditional kernel density estimation and newly developed neural density estimation. To illustrate the hierarchical Bayesian formulation, we present one linear volume average inversion and two non-linear hydrologic modeling cases, including a 3D case study. This chapter provides possible solutions to many model calibration challenges we face in model-data integration: hierarchical modeling, likelihood definitions, and effective calibration for large spatial fields. In Chapters 4 and 5, we present two real case studies of model-data integration. Chapter 4 examines the impact of beaver ponds on flow dynamics in a mountainous floodplain in Colorado using hydrologic modeling and model-data integration. The recovery of beavers in North America has been adopted as an ecosystem restoration tool to increase surface and groundwater storage and improve biodiversity at the reach scale. We investigate the effects of beavers on hydrologic flows, particularly on the deep baseflow in aquifers, by constructing a 3D hydrologic floodplain model. We calibrate the model to the baseflow piezometer measurement using the likelihood-free methods from Chapter 3.
Our sensitivity analysis shows that beaver ponds increase the cumulative vertical flow from the fines to the gravel bed but have little effect on the deep underflow in the gravel bed aquifer, suggesting that beaver ponds are disconnected from the main downstream flow. This study aims to improve our understanding of the hydrologic consequences associated with the increasing use of beaver restoration as a climate adaptation strategy. In Chapter 5, we propose a statistical model for constructing 3D redox structures in Danish farmlands to address agricultural nitrogen pollution, which is a global problem that could be exacerbated by hydrologic shifts from climate change. The redox environment in the subsurface is essential for the natural removal of nitrate by denitrification. We combine towed transient electromagnetic (tTEM) resistivity data and redox boreholes to model 3D redox architecture stochastically. However, tTEM surveys and redox boreholes are often not colocated. To address this issue, we perform geostatistical simulations to generate multiple resistivity data colocated with redox boreholes. We then use a statistical learning method, multinomial logistic regression, to predict multiple 3D redox architectures given the uncertain surrounding resistivity structures. We identify the resistivity structures that are statistically significant for redox predictions and formulate an inverse problem to better match the redox borehole data using the local PCA method from Chapter 3. These two chapters illustrate two alternative approaches to hydrologic prediction: physics-based modeling and statistical modeling. In Chapter 6, we introduce a fast surrogate flow and transport model to evaluate the impact of climate on groundwater contamination. The surrogate modeling approach is applied at the Department of Energy's Savannah River Site F-Area, which contains nuclear wastewater.
We present two time-dependent neural network architectures, U-FNO-3D and U-FNO-2D, which differ in how they incorporate the time dimension. Furthermore, we integrate a custom loss function that accounts for both data-driven terms and physical boundary constraints. This chapter offers a way to reduce the computational cost of numerical modeling, which is critical for making timely decisions that bridge science and practical applications. This dissertation provides novel methods for geological modeling and model calibration and applies them to real-world problems, highlighting the importance of both method development and practical implementation in addressing hydrologic challenges posed by uncertain climates.
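
The Chapter 2 sampling idea (a product of density functions for different information sources, explored by Metropolis-Hastings with stationary Gaussian field perturbations) can be illustrated with a minimal 1D sketch. This is not the dissertation's implementation: the function names, the Gaussian data density, the slope-based "knowledge" density, the moving-average construction of the perturbation field, and all parameter values are assumptions chosen only to make the example self-contained.

```python
import math
import random


def gaussian_perturbation(n, corr_len, rng):
    """Stationary, unit-variance Gaussian field on n cells (assumed construction):
    white noise smoothed by a Gaussian kernel, padded to avoid edge effects."""
    half = 3 * corr_len
    kernel = [math.exp(-0.5 * (k / corr_len) ** 2) for k in range(-half, half + 1)]
    norm = math.sqrt(sum(w * w for w in kernel))  # rescale to unit variance
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * half)]
    return [sum(w * noise[i + j] for j, w in enumerate(kernel)) / norm
            for i in range(n)]


def log_data_density(z, obs, sigma):
    """Hypothetical density for point data (e.g. borehole picks): Gaussian misfit."""
    return -0.5 * sum(((z[i] - d) / sigma) ** 2 for i, d in obs.items())


def log_knowledge_density(z, slope_sigma):
    """Hypothetical density for prior knowledge: the interface should be gently sloping."""
    return -0.5 * sum(((b - a) / slope_sigma) ** 2 for a, b in zip(z, z[1:]))


def log_target(z, obs, sigma=0.1, slope_sigma=0.2):
    """Log of the product of the two densities above."""
    return log_data_density(z, obs, sigma) + log_knowledge_density(z, slope_sigma)


def sample_interfaces(obs, n=50, n_iter=2000, step=0.05, corr_len=5, seed=0):
    """Random-walk Metropolis-Hastings over a 1D interface elevation profile."""
    rng = random.Random(seed)
    z = [0.0] * n  # flat initial interface
    logp = log_target(z, obs)
    samples, accepted = [], 0
    for _ in range(n_iter):
        pert = gaussian_perturbation(n, corr_len, rng)
        proposal = [zi + step * pi for zi, pi in zip(z, pert)]
        logp_new = log_target(proposal, obs)
        # The proposal is symmetric, so the acceptance ratio is the target ratio.
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            z, logp = proposal, logp_new
            accepted += 1
        samples.append(z)
    return samples, accepted / n_iter


if __name__ == "__main__":
    # Hypothetical observations: grid index -> interface elevation.
    observations = {10: 1.0, 25: -0.5, 40: 0.8}
    chain, rate = sample_interfaces(observations)
    print(f"acceptance rate: {rate:.2f}")
    print("last sample at data locations:",
          [round(chain[-1][i], 2) for i in observations])
```

Because each perturbation is a spatially correlated field rather than independent noise per cell, every proposal stays smooth, which is one plausible reason to prefer this kind of proposal when sampling surfaces. Extending the sketch to 2D surfaces or implicit (level-set style) interfaces would require a different field generator and target definition.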


Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023
Issuance monographic
Language English


Author Wang, Lijing
Degree supervisor Caers, Jef
Thesis advisor Caers, Jef
Thesis advisor Maher, Katharine
Thesis advisor Mukerji, Tapan, 1965-
Degree committee member Maher, Katharine
Degree committee member Mukerji, Tapan, 1965-
Associated with Stanford Doerr School of Sustainability
Associated with Stanford University, Department of Geological Sciences


Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Lijing Wang.
Note Submitted to the Department of Geological Sciences.
Thesis Thesis Ph.D. Stanford University 2023.

Access conditions

© 2023 by Lijing Wang
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
