Zhu, Feiyu
2022-05-24
2022-05-24
2022-05-24
2022-05-10
http://hdl.handle.net/10012/18325
Our work aims to solve some of the most significant and fundamental theoretical problems involved in the current statistical modeling of stochastic processes in single-molecule experiments, for which a well-recognized yet mathematically very difficult model is the generalized Langevin equation (GLE). We mainly focus on the following three directions. In Chapter 2, we prove a remarkable representation theorem that fundamentally connects a continuous stationary process, a widely adopted statistical model in various applications, with the GLE, a ubiquitous tool in physics to model stochastic dynamics in a thermodynamic system. However, there are two important statistical challenges. First, the dynamics of the observed particle must be deconvoluted from a postulated covariance structure of an unobservable thermal force, rendering statistical modeling typically intractable. Second, for a given covariance structure of the latent force, the likelihood function for parameter estimation can rarely be written in closed form in the time domain. Parameter estimation that involves numerical approximation of the given memory kernel via repeated application of the Fast Fourier Transform (FFT) often incurs significant information loss and computational burden. We aim to fill these gaps by establishing a representation theorem stating that any continuous stationary process can be represented by a physically valid GLE. The upshot is that statistical modeling and inference can be performed entirely in the time domain with the guarantee of satisfying the fundamental laws of physics. The result can also be extended to continuous processes with only stationary increments.
In Chapter 3, we carefully study the asymptotic properties of some important spectral density estimators for high-throughput (HTP) data commonly obtained in modern nanoscopic scientific experiments, where the sampling frequency of an underlying process is set to be extremely high and the recording of such a continuous process can also be extended over a very long time, giving us more and more observations. Traditional asymptotic results with fixed sampling frequency would break down in such a situation. To the best of our knowledge, asymptotic results for spectral density estimators given HTP data were rarely seen in the literature at the time of writing our work (Lysy et al., 2022). We fill this gap by laying the theoretical foundations for high-frequency sampled stationary processes, based on which a novel and effective two-stage approach was proposed by Lysy et al. (2022) to obtain robust and efficient parametric estimates from noisy HTP data.
In Chapter 4, we design an original non-degenerate sampling scheme using the particle filter method with bridge proposal to better estimate parameters in the quasi-Markovian approximation of the GLE, which exhibits hypoellipticity. The proposed method, if evaluated numerically, would be extremely helpful for efficient parameter estimation in statistical modeling and inference problems tackling the nonlinear GLE.
en
generalized Langevin equation
single-particle dynamics
Hamiltonian systems
semigroups
skew-adjoint operator
spectral theorem
spectral density
autocorrelations
stationary processes
high-frequency asymptotics
high-throughput data
Whittle likelihood
log-periodogram
quasi-Markovian approximation
Euler-Maruyama discretization
Ito-Taylor expansion
integrated Wiener process
particle filter
hypoelliptic diffusions
diffusion bridge
Single-Particle Dynamics in Nanoscopic Systems: Statistical Modeling and Inference
Doctoral Thesis