Statistical Inference in ROC Curve Analysis

Hu, Dingding

Files

Hu_Dingding.pdf (1.29 MB)

Date

2025-07-07

Authors

Hu, Dingding

Advisor

Li, Pengfei

Publisher

University of Waterloo

Abstract

The receiver operating characteristic (ROC) curve is a powerful statistical tool for evaluating the diagnostic ability of a binary classifier across discrimination thresholds. It has been widely applied in various scientific areas. This thesis considers three inference problems in ROC curve analysis. In Chapter 1, we introduce the basic concept of the ROC curve, along with some of its summary indices. We then provide an overview of the research problems and outline the structure of the subsequent chapters.

Chapter 2 focuses on improving ROC curve analysis with a single biomarker by incorporating the assumption that higher biomarker values indicate greater disease severity or likelihood. We interpret “greater severity” as a higher probability of disease, which corresponds to the likelihood ratio ordering between diseased and healthy individuals. Under this assumption, we propose a Bernstein polynomial-based method to model and estimate the biomarker distributions using the maximum empirical likelihood framework. From the estimated distributions, we derive the ROC curve and its summary indices. We establish the asymptotic consistency of our estimators, evaluate their performance through extensive simulations, and compare them with existing methods. A real-data example demonstrates the practical applicability of our approach.

Chapter 3 considers ROC curve analysis for medical data with non-ignorable missingness in the disease status. Within the framework of logistic regression models for both the disease status and the verification status, we first establish the identifiability of the model parameters and then propose a likelihood method to estimate the model parameters, the ROC curve, and the area under the ROC curve (AUC) for the biomarker. The asymptotic distributions of these estimators are established. Through extensive simulation studies, we compare our method with competing methods in terms of point estimation and assess the accuracy of confidence interval estimation under various scenarios. To illustrate the method on real data, we apply it to the Alzheimer's disease dataset from the National Alzheimer's Coordinating Center.

Chapter 4 explores the combination of multiple biomarkers when disease status is determined by an imperfect reference standard, which can lead to misclassification. Previous methods for combining multiple biomarkers typically assume that all disease statuses are determined by a gold standard test, which limits their ability to accurately estimate the ROC curve and AUC in the presence of misclassification. We propose modeling the distributions of biomarkers from truly healthy and truly diseased individuals using a semiparametric density ratio model. We also adopt two assumptions from the literature: (1) the biomarkers are conditionally independent of the classification of the imperfect reference standard given the true disease status, and (2) the classification accuracy of the imperfect reference standard is known. Using this framework, we establish the identifiability of the model parameters and propose a maximum empirical likelihood method to estimate the ROC curve and AUC for the optimal combination of biomarkers. An Expectation-Maximization algorithm is developed for numerical calculation. In addition, we propose a bootstrap method to construct a confidence interval for the AUC and a confidence band for the ROC curve. Extensive simulations are conducted to evaluate the robustness of our method with respect to label misclassification. Finally, we demonstrate the effectiveness of our method in a real-data application.

In Chapter 5, we provide a brief summary of Chapters 2-4 and outline several directions for future research.
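
As a point of reference for the quantities named in the abstract, the following is a minimal sketch of the standard empirical ROC curve, the Mann-Whitney estimate of the AUC, and a simple percentile bootstrap interval for the AUC. It is not the method proposed in the thesis: it assumes fully observed, correctly labelled disease status and a single biomarker. The function names and the toy data are illustrative assumptions only.

```python
import random


def empirical_auc(healthy, diseased):
    """Mann-Whitney estimate of P(diseased > healthy); ties count as 1/2."""
    total = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                total += 1.0
            elif d == h:
                total += 0.5
    return total / (len(healthy) * len(diseased))


def empirical_roc(healthy, diseased):
    """(FPR, TPR) points from sweeping the threshold over all observed values.

    A subject is classified as diseased when the biomarker is >= threshold.
    """
    thresholds = sorted(set(healthy) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(h >= t for h in healthy) / len(healthy)
        tpr = sum(d >= t for d in diseased) / len(diseased)
        points.append((fpr, tpr))
    return points


def bootstrap_auc_ci(healthy, diseased, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC, resampling each group separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        h = rng.choices(healthy, k=len(healthy))
        d = rng.choices(diseased, k=len(diseased))
        stats.append(empirical_auc(h, d))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lower, upper


# Toy biomarker values (hypothetical, for illustration only).
healthy = [1, 2, 3, 4]
diseased = [3, 5, 6, 7]
print(empirical_auc(healthy, diseased))  # 0.90625
```

The area under the piecewise-linear curve through the points returned by `empirical_roc` equals the Mann-Whitney estimate, which is why the AUC can be read either as an area or as the probability that a diseased subject has a higher biomarker value than a healthy one.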