Nonparametric Estimation in a Compound Mixture Model and False Discovery Rate Control with Auxiliary Information

Date

2020-05-15

Authors

Tian, Zhaoyang

Publisher

University of Waterloo

Abstract

In this thesis, we focus on two important statistical problems. The first is nonparametric estimation in a compound mixture model, with application to a malaria study. The second is control of the false discovery rate in multiple hypothesis testing applications with auxiliary information.

Malaria can be diagnosed by the presence of parasites and of symptoms (usually fever) due to the parasites. In endemic areas, however, an individual may have fever attributable either to malaria or to other causes. Thus, the parasite level of an individual with fever follows a two-component mixture distribution, with the two components corresponding to malaria and nonmalaria individuals. Furthermore, the parasite levels of nonmalaria individuals can be characterized as a mixture of a zero component and a positive distribution, while the parasite levels of malaria individuals can only be positive. Therefore, the parasite level of an individual with fever follows a compound mixture model. In Chapter 2, we propose a maximum multinomial likelihood approach for estimating the unknown parameters and functions using parasite-level data from two groups of individuals: the first group contains only malaria individuals, while the second group is a mixture of malaria and nonmalaria individuals. We develop an EM algorithm to numerically calculate the maximum multinomial likelihood estimates and further establish their convergence rates. Simulation results show that the proposed maximum multinomial likelihood estimators are more efficient than existing nonparametric estimators. The proposed method is used to analyze malaria survey data.

In many multiple hypothesis testing applications, thousands of null hypotheses are tested simultaneously. For each null hypothesis, a test statistic and the corresponding p-value are usually calculated. Traditional rejection rules work on p-values and hence ignore the signs of the test statistics in two-sided tests. However, the signs may carry useful directional information in two-group comparison settings. In Chapter 3, we introduce a novel procedure, the signed-knockoff procedure, to utilize the directional information and control the false discovery rate in finite samples. We demonstrate the power advantage of our procedure through simulation studies and two real applications. In Chapter 4, we further extend the signed-knockoff procedure to incorporate additional information from covariates, which may be missing. We propose a new procedure, the covariate and direction adaptive knockoff procedure, and show that it controls the false discovery rate in finite samples. Simulation studies and real data analysis show that our procedure is competitive with existing covariate-adaptive methods.

In Chapter 5, we summarize our contributions and outline several interesting topics worthy of further exploration.

LC Keywords

Nonparametric statistics, Malaria, Testing, Statistical methods
