TITLE: Fast methods for identifying high dimensional systems using observations

ABSTRACT:

Computational modeling is a popular tool for understanding a diverse set of complex systems. The output of a computational model depends on a set of parameters that are unknown to the modeler, who can estimate them by collecting physical data. In the second chapter of this thesis, we study the action potential of ventricular myocytes, where the parameter of interest is a function rather than a scalar or a set of scalars. We develop a new modeling strategy to study the functional parameter nonparametrically using Bayesian inference with Gaussian process priors. We also devise a new Markov chain Monte Carlo sampling scheme to address this unique problem. In the more general case, computational simulation is expensive. Emulators avoid the repeated use of an expensive simulation by performing a designed experiment on the computer simulation and developing a predictive distribution. Random field models are considered the standard in the analysis of computer experiments, but the current framework fails in high dimensional scenarios because of the cost of inference. The third chapter of this thesis shows that, by using a particular class of experimental designs, the computational cost of inference from random fields scales significantly better in high dimensions. Exact prediction and likelihood evaluation with close to half a million design points is possible in seconds using only a laptop computer. Simulation and analytic results show that the proposed designs are competitive with the more common space-filling designs in terms of prediction accuracy. The fourth chapter of this thesis proposes a method to construct an emulator for a stochastic simulation. Existing emulators have focused on estimating the mean of the simulation output, but this work presents an emulator for the distribution of the output in a nonparametric setting. This construction provides both an explicit distribution and a fast sampling scheme.
Beyond describing the emulator, this work demonstrates that the emulator's convergence rate is asymptotically optimal among all possible emulators using the same sample size. Lastly, the fifth chapter of this thesis investigates the use of a modified version of the above method to study patterns of defects on products. We achieve efficient inference on the defect patterns by developing a novel estimate of an inhomogeneous point process that is both computationally tractable and asymptotically appealing.