Efficient Bayesian Computation with Applications in Neuroscience and Meteorology

Date

2024-08-23

Advisor

Lysy, Martin
Ramezan, Reza

Publisher

University of Waterloo

Abstract

Hierarchical models are important tools for analyzing data in many disciplines. With the rapid growth of data, efficiency and scalability in model inference have become increasingly important areas of research. Traditionally, parameters in a hierarchical model are estimated either by deriving closed-form estimates or by Monte Carlo sampling. Since the former approach is only possible for simpler models with conjugate priors, the latter, Markov chain Monte Carlo (MCMC) methods in particular, has become the standard approach for inference when no closed form is available. However, MCMC requires substantial computational resources when sampling from hierarchical models with complex structures, highlighting the need for more computationally efficient inference methods. In this thesis, we study the design of Bayesian inference methods to improve computational efficiency, with a focus on a class of hierarchical models known as latent Gaussian models. The background of hierarchical modelling and Bayesian inference is introduced in Chapter 1. In Chapter 2, we present a fast and scalable approximate inference method for a model widely used in meteorological data analysis. The model features a likelihood layer based on the generalized extreme value (GEV) distribution and a latent layer that integrates spatial information via Gaussian process (GP) priors on the GEV parameters, hence the name GEV-GP model. The computational bottleneck is the large number of spatial locations being studied, which determines the dimensionality of the GPs. We present an inference procedure based on the Laplace approximation to the likelihood followed by a Normal approximation to the posterior of interest.
By combining this approach with a sparsity-inducing spatial covariance approximation technique, we demonstrate through simulations that it accurately estimates the Bayesian predictive distribution of extreme weather events, scales to several thousand spatial locations, and is several orders of magnitude faster than MCMC. We also present a case study in forecasting extreme snowfall across Canada. Building on the approximate inference scheme discussed in Chapter 2, Chapter 3 introduces a new modelling framework for capturing the correlation structure in high-dimensional neuronal data known as spike trains. We propose a novel continuous-time multi-neuron latent factor model based on the biological mechanism of spike generation, where the underlying neuronal activities are represented by a multivariate Markov process. To the best of our knowledge, this is the first multivariate spike-train model in a continuous-time setting to study interactions between neural spike trains. A computationally tractable Bayesian inference procedure is proposed to address the challenges in estimating high-dimensional latent parameters. We show that the proposed model and inference method can accurately recover underlying neuronal interactions when applied to a variety of simulated datasets. Application of our model to experimental data reveals that the correlation structure of spike trains in rats' orbitofrontal cortex predicts outcomes following different cues. Whereas Chapter 3 restricts modelling to Markov processes for the latent dynamics of spike trains, Chapter 4 presents an efficient inference method for non-Markov stationary GPs with noisy observations. Computations for such models typically scale as $\mathcal{O}(n^3)$ in the number of observations $n$; our method uses a "superfast" Toeplitz system solver that reduces the computational complexity to $\mathcal{O}(n \log^2 n)$.
We demonstrate that our method dramatically improves the scalability of Gaussian Process Factor Analysis (GPFA), which is commonly used to extract low-dimensional representations of high-dimensional neural data. We extend GPFA to accommodate Poisson count observations and design a superfast MCMC inference algorithm for the extended model. The accuracy and speed of our inference algorithms are illustrated through simulation studies.
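
As a rough illustration of the structure that Chapter 4 exploits: a stationary GP observed on an equally spaced grid has a Toeplitz covariance matrix, so linear solves need only its first column. The sketch below is not the thesis's method. It uses SciPy's `solve_toeplitz`, which is based on Levinson recursion and costs $\mathcal{O}(n^2)$, as a stand-in for the superfast $\mathcal{O}(n \log^2 n)$ solver, and compares it with a dense Cholesky solve. The squared-exponential kernel, the grid spacing, the noise variance, and all function names are assumptions made for this example.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve, solve_toeplitz, toeplitz


def noisy_sq_exp_acf(n, dt=0.1, length_scale=1.0, noise_var=0.1):
    """First column of the covariance of n equally spaced noisy GP observations.

    Assumed kernel: squared exponential (illustrative choice only).
    """
    lags = dt * np.arange(n)
    acf = np.exp(-0.5 * (lags / length_scale) ** 2)
    acf[0] += noise_var  # independent observation noise affects the diagonal only
    return acf


def quad_form_toeplitz(acf, y):
    """Compute y^T K^{-1} y from the first column of K via Levinson recursion."""
    return float(y @ solve_toeplitz(acf, y))


def quad_form_dense(acf, y):
    """Same quantity via a dense O(n^3) Cholesky factorization, for reference."""
    K = toeplitz(acf)
    return float(y @ cho_solve(cho_factor(K), y))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 500
    acf = noisy_sq_exp_acf(n)
    y = rng.standard_normal(n)
    print(quad_form_toeplitz(acf, y), quad_form_dense(acf, y))
```

The two functions should agree up to numerical error. The Toeplitz version never forms the full $n \times n$ matrix, which is the property a superfast solver pushes further.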

Keywords

hierarchical models, Bayesian computation, computational neuroscience, spatial statistics, approximate inference
