Controlling the false discovery rate with dynamic adaptive procedures and of grouped hypotheses

MacDonald, Peter William

Files

MacDonald_Peter.pdf (684.05 KB)

Date

2018-08-08

Authors

MacDonald, Peter William

Advisor

Liang, Kun

Publisher

University of Waterloo

Abstract

In the multiple testing problem with independent tests, the classical Benjamini-Hochberg (BH) procedure controls the false discovery rate (FDR) below the target FDR level. Adaptive procedures can improve power by incorporating estimates of the proportion of true null hypotheses, which typically rely on a tuning parameter. Fixed adaptive procedures set their tuning parameters before seeing the data and can be shown to control the FDR in finite samples. In Chapter 2 of this thesis, we develop theoretical results for dynamic adaptive procedures whose tuning parameters are determined by the data. We show that, if the tuning parameter is chosen according to a left-to-right stopping time rule, the corresponding dynamic adaptive procedure controls the FDR in finite samples. Examples include the recently proposed right-boundary procedure and the widely used lowest-slope procedure, among others. Simulation results show that the right-boundary procedure is more powerful than other dynamic adaptive procedures under independence and mild dependence conditions.

The BH procedure implicitly assumes all hypotheses are exchangeable. When hypotheses come from known groups, this assumption is inefficient, and power can be improved through a ranking of significance that incorporates group information. In Chapter 3 of this thesis, we define a general sequential framework for multiple testing procedures in the grouped setting. We develop a flexible grouped mirrored knockoff (GMK) procedure which approximates the optimal ranking of significance. We show that the GMK procedure controls the FDR in finite samples, and give a particular data-driven implementation using the expectation-maximization algorithm. Simulation and a real data example demonstrate that the GMK procedure outperforms its competitors in terms of power and FDR control with independent tests.
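For readers unfamiliar with the procedures named in the abstract, the following is a minimal Python sketch of the BH step-up rule and of a fixed adaptive variant. It is not taken from the thesis. The function names, the example p-values, the tuning value `lam = 0.5`, and the choice of a Storey-type estimator of the null proportion (with the usual "+1" correction and rejections capped at `lam`) are all assumptions made for illustration. The dynamic procedures studied in Chapter 2 would instead choose `lam` from the data, which this sketch does not attempt.

```python
from typing import List


def step_up_reject(pvals: List[float], level: float, cap: float = 1.0) -> List[bool]:
    """Step-up rule: find the largest rank k with p_(k) <= min(cap, level * k / m)
    and reject the k smallest p-values. With cap = 1 this is the BH procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by increasing p-value
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= min(cap, level * rank / m):
            k = rank  # keep the largest rank that satisfies the bound
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


def bh_reject(pvals: List[float], alpha: float) -> List[bool]:
    """Benjamini-Hochberg procedure at target FDR level alpha."""
    return step_up_reject(pvals, alpha)


def storey_pi0(pvals: List[float], lam: float) -> float:
    """Storey-type estimate of the proportion of true nulls, based on the number
    of p-values above the tuning parameter lam (assumed estimator)."""
    m = len(pvals)
    return (1 + sum(p > lam for p in pvals)) / (m * (1 - lam))


def adaptive_bh_reject(pvals: List[float], alpha: float, lam: float = 0.5) -> List[bool]:
    """Fixed adaptive procedure: lam is set before seeing the data, and the
    step-up rule is run at the inflated level alpha / pi0_hat, capped at lam."""
    pi0_hat = storey_pi0(pvals, lam)
    return step_up_reject(pvals, alpha / pi0_hat, cap=lam)


# Hypothetical p-values, chosen only to show the two procedures disagreeing.
example_pvals = [0.001, 0.008, 0.012, 0.019, 0.03, 0.2, 0.45, 0.7, 0.9, 0.95]
n_bh = sum(bh_reject(example_pvals, 0.05))
n_adaptive = sum(adaptive_bh_reject(example_pvals, 0.05, lam=0.5))
print(n_bh, n_adaptive)
```

On these made-up p-values the estimated null proportion is below one, so the adaptive procedure runs the step-up rule at a larger effective level and rejects one more hypothesis than BH. This is the kind of power gain the abstract attributes to adaptive procedures.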

Keywords

simultaneous inference, false discovery rate, multiple hypothesis testing, martingale, stopping time

URI

http://hdl.handle.net/10012/13549

Collections

Theses
Statistics and Actuarial Science
