Enhanced Backward Multiple Change-Point Detection
Abstract
Many statistical tools are built upon a specific set of assumptions on the distribution of the
data at hand. However, the distribution of the observations in the dataset may not remain
constant and may change due to some external events. For a sequence of observations,
the points after which the distribution function has changed are commonly referred to
as change points. Identifying such points can also be critical in gaining insights into the
distributional behaviour of random variables and constructing statistical models. Thus,
change point analysis potentially applies to almost all data-driven disciplines, such as
biology, finance, and public policy.
Change point analysis is divided into online and offline analysis. Online change
point analysis is designed to detect changes in the distribution of random variables as
new observations arrive. Offline analysis, on the other hand, is concerned with
recovering change points within a historical dataset. This thesis is concerned only
with offline change point analysis; for simplicity, we refer to it
as change point analysis.
Change point analysis originated about 70 years ago in the quality control literature (Page,
1954). Initially, the main focus of the change point literature was on the single change
point scenario, in which at most one change point exists within a sequence of random
variables. However, with the advent of computers, the focus has shifted to multiple change
point detection problems. This shift does not imply that single change point detection
methods are irrelevant. For instance, many multiple change point detection methods
recover change points by conducting a single change point test locally. Methods in this
class are called local search methods.
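To make the idea of a single change point test applied to one segment concrete, the following is a minimal sketch that assumes a change in mean and uses a CUSUM-type statistic. The mean-change setting, the function name cusum_scan, and its arguments are illustrative assumptions and are not taken from the thesis.

```python
def cusum_scan(x, start, end):
    """Scan the segment x[start:end] for a single change in mean.

    Returns (best_split, max_stat), where best_split is the index b such
    that the candidate change lies between x[b - 1] and x[b]. Returns
    (None, 0.0) if the segment is too short or shows no mean difference.
    The mean-change model is an illustrative assumption.
    """
    n = end - start
    if n < 2:
        return None, 0.0
    total = sum(x[start:end])
    best_split, best_stat = None, 0.0
    left_sum = 0.0
    for b in range(start + 1, end):
        left_sum += x[b - 1]
        n_left, n_right = b - start, end - b
        mean_left = left_sum / n_left
        mean_right = (total - left_sum) / n_right
        # Scaled absolute difference between the two sample means.
        stat = (n_left * n_right / n) ** 0.5 * abs(mean_left - mean_right)
        if stat > best_stat:
            best_split, best_stat = b, stat
    return best_split, best_stat
```

A local search method would call such a routine on many sub-intervals of the sequence and keep a split only where the statistic is large compared with a chosen threshold.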
One of the primary concerns of local search methods is the application of a single change
point test statistic within the largest possible segment of the sequence of random variables
with exactly one change point. Obtaining such intervals is a difficult task. For instance,
wild binary segmentation (Fryzlewicz et al., 2014) extracts the change points from intervals
containing multiple change points. On the other hand, the narrowest over threshold method
(Baranowski et al., 2019) estimates the change points within the narrowest intervals in which a
predefined threshold is satisfied. Thus, the accuracy of the estimated change point
locations may suffer because these intervals are short. In this thesis, we propose two local
search methods that attempt to infer the locations of change points within the desirable
intervals. The first method, enhanced backward detection (EBD), recovers the change points
by sequentially eliminating unlikely candidates. The second method, narrowest over
threshold via interval selection with shortened exhaustive search (NOT-IS.SES), estimates
the locations of change points by following a top-down approach. That is, the change points
are added to the active set sequentially. EBD and NOT-IS.SES are general procedures that
can be applied to a wide range of change point problems by simply changing the underlying
single change point test statistics.
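The general idea of recovering change points by sequentially eliminating unlikely candidates, with a pluggable single change point statistic, can be sketched as follows. This is a generic backward-elimination loop written under stated assumptions and is not the EBD algorithm of the thesis: the names backward_eliminate and mean_diff_stat, the mean-change statistic, and the fixed threshold stopping rule are all hypothetical choices made for illustration.

```python
def mean_diff_stat(x, left, cand, right):
    """Scaled absolute mean difference between x[left:cand] and x[cand:right].

    An illustrative statistic for a change in mean (an assumption).
    """
    n_left, n_right = cand - left, right - cand
    mean_left = sum(x[left:cand]) / n_left
    mean_right = sum(x[cand:right]) / n_right
    scale = (n_left * n_right / (n_left + n_right)) ** 0.5
    return scale * abs(mean_left - mean_right)


def backward_eliminate(x, candidates, threshold, stat=mean_diff_stat):
    """Drop the weakest candidate until every survivor reaches the threshold.

    candidates must be indices strictly between 0 and len(x). Passing a
    different stat swaps the underlying single change point test.
    """
    active = sorted(set(candidates))
    while active:
        bounds = [0] + active + [len(x)]
        # Score each candidate on the segment between its two neighbours.
        scores = [stat(x, bounds[i - 1], bounds[i], bounds[i + 1])
                  for i in range(1, len(bounds) - 1)]
        weakest = min(range(len(scores)), key=scores.__getitem__)
        if scores[weakest] >= threshold:
            break
        del active[weakest]
    return active
```

Because each candidate is scored only on the segment between its current neighbours, removing weak candidates progressively lengthens the segments on which the remaining candidates are tested.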
Cite this version of the work
Shahab Pirnia (2023). Enhanced Backward Multiple Change-Point Detection. UWSpace.
http://hdl.handle.net/10012/19431