Developing Efficient Strategies for Automatic Calibration of Computationally Intensive Environmental Models
Razavi, Seyed Saman
MetadataShow full item record
Environmental simulation models have been playing a key role in civil and environmental engineering decision making processes for decades. The utility of an environmental model depends on how well the model is structured and calibrated. Model calibration is typically in an automated form where the simulation model is linked to a search mechanism (e.g., an optimization algorithm) such that the search mechanism iteratively generates many parameter sets (e.g., thousands of parameter sets) and evaluates them through running the model in an attempt to minimize differences between observed data and corresponding model outputs. The challenge rises when the environmental model is computationally intensive to run (with run-times of minutes to hours, for example) as then any automatic calibration attempt would impose a large computational burden. Such a challenge may make the model users accept sub-optimal solutions and not achieve the best model performance. The objective of this thesis is to develop innovative strategies to circumvent the computational burden associated with automatic calibration of computationally intensive environmental models. The first main contribution of this thesis is developing a strategy called “deterministic model preemption” which opportunistically evades unnecessary model evaluations in the course of a calibration experiment and can save a significant portion of the computational budget (even as much as 90% in some cases). Model preemption monitors the intermediate simulation results while the model is running and terminates (i.e., pre-empts) the simulation early if it recognizes that further running the model would not guide the search mechanism. This strategy is applicable to a range of automatic calibration algorithms (i.e., search mechanisms) and is deterministic in that it leads to exactly the same calibration results as when preemption is not applied. One other main contribution of this thesis is developing and utilizing the concept of “surrogate data” which is basically a reasonably small but representative proportion of a full set of calibration data. This concept is inspired by the existing surrogate modelling strategies where a surrogate model (also called a metamodel) is developed and utilized as a fast-to-run substitute of an original computationally intensive model. A framework is developed to efficiently calibrate hydrologic models to the full set of calibration data while running the original model only on surrogate data for the majority of candidate parameter sets, a strategy which leads to considerable computational saving. To this end, mapping relationships are developed to approximate the model performance on the full data based on the model performance on surrogate data. This framework can be applicable to the calibration of any environmental model where appropriate surrogate data and mapping relationships can be identified. As another main contribution, this thesis critically reviews and evaluates the large body of literature on surrogate modelling strategies from various disciplines as they are the most commonly used methods to relieve the computational burden associated with computationally intensive simulation models. To reliably evaluate these strategies, a comparative assessment and benchmarking framework is developed which presents a clear computational budget dependent definition for the success/failure of surrogate modelling strategies. Two large families of surrogate modelling strategies are critically scrutinized and evaluated: “response surface surrogate” modelling which involves statistical or data–driven function approximation techniques (e.g., kriging, radial basis functions, and neural networks) and “lower-fidelity physically-based surrogate” modelling strategies which develop and utilize simplified models of the original system (e.g., a groundwater model with a coarse mesh). This thesis raises fundamental concerns about response surface surrogate modelling and demonstrates that, although they might be less efficient, lower-fidelity physically-based surrogates are generally more reliable as they to-some-extent preserve the physics involved in the original model. Five different surface water and groundwater models are used across this thesis to test the performance of the developed strategies and elaborate the discussions. However, the strategies developed are typically simulation-model-independent and can be applied to the calibration of any computationally intensive simulation model that has the required characteristics. This thesis leaves the reader with a suite of strategies for efficient calibration of computationally intensive environmental models while providing some guidance on how to select, implement, and evaluate the appropriate strategy for a given environmental model calibration problem.