Assessing Binary Measurement Systems Using Targeted Verification with a Gold Standard

Severn, Daniel Ernest

Date

2017-05-19

Authors

Severn, Daniel Ernest

Advisor

Steiner, Stefan

Publisher

University of Waterloo

Abstract

Binary Measurement Systems (BMS) are used to classify objects into two categories. Sometimes the categories represent an intrinsically dichotomous characteristic of the object, but sometimes continuous or even multidimensional characteristics are simplified into a dichotomy. In medicine, pregnancy is the typical example of a truly dichotomous characteristic, whereas Alzheimer’s disease may be a continuous or multidimensional characteristic that one may nonetheless wish to simplify into a dichotomy for diagnosis. In both cases a BMS is used to classify the patient into one of two categories: pregnant or not pregnant, diseased or non-diseased. Most BMS are not inerrant; they misclassify patients, and these misclassifications can have very damaging consequences for the patients’ health. Therefore, in the effort to understand and improve the BMS being used or developed, there needs to be a formalized way of studying and judging the merits of a BMS. While BMS are used throughout society, the two main areas where they are formalized in this way are medicine and manufacturing. Medical BMS are designed to determine the presence of a disease or other medical condition. Manufacturing BMS are designed to determine whether a manufactured item meets a specified quality standard. This abstract uses language and examples typical of the medical application because they are easier for most people to understand and relate to. However, most of the thesis was written with an eye to publication in quality improvement journals and is thus typically written for that audience. Two primary attributes of a BMS are used to judge its quality: when a subject is measured once with the BMS, what is the probability of a false positive diagnosis, and what is the probability of a false negative diagnosis? In the standard statistical framework (PPDAC – Problem, Plan, Data, Analysis, and Conclusion), the problem this thesis addresses is determining these two quantities for a BMS.
The thesis develops new plans and estimation techniques for this purpose. These plans assume that a perfect “gold standard” measurement system is available. They also assume that it is possible to measure a subject repeatedly and that one measurement does not affect other measurements. A primary goal of the plans in this thesis is to reduce the number of gold standard measurements needed for a given level of precision. The context usually implies that there is some difficulty in using the gold standard measurement system in practice; were this not the case, the gold standard could be used instead of the BMS being assessed. For example, some gold standard measurements can only be performed on a dead patient, while the BMS being assessed is intended for a living patient. Alternatively, the gold standard could be very expensive because no errors are permitted. The thesis considers two scenarios. The first is assessing a new BMS, where no information is available prior to the study and where only sampling directly from the population of subjects is possible. The second is assessing a BMS that is currently in use, where some information is available prior to the study and where subjects previously classified by the BMS are available to sample from. Chapters 2 and 3 consider the first scenario, while Chapters 4 and 5 consider the second. Chapter 1 gives an introduction to the assessment of BMS and a review of the academic literature relevant to this thesis. Chapter 2 considers a sequential statistical plan for assessing a BMS that introduces a new design concept called Targeted Verification. Targeted Verification refers to targeting specific subjects to “verify” with the gold standard based on the outcome of previous phases in the sequential plan. This plan can dramatically reduce the number of patients that need to be verified while attaining performance similar to that of plans that verify all patients and avoiding the pitfalls of plans that verify no patients.
Chapter 3 develops a set of closed form estimates that avoid making subjective assumptions and thus have relevant theoretical properties, while retaining competitive empirical performance. Chapter 4 takes the Targeted Verification concept and adapts it to the second scenario, where a BMS is currently in use. It incorporates the information previously available about the BMS and, in sampling, takes advantage of the availability of patients previously categorized by the BMS. It shows that the Targeted Verification concept is much more efficient than similar plans that verify all subjects, and much more reliable than plans that do not use a gold standard. Chapter 5 develops a set of estimates with the same design philosophy as those of Chapter 3. To incorporate the design elements of Chapter 4, the new estimates are no longer closed form, but they still avoid making subjective assumptions. The estimates have relevant theoretical properties and competitive empirical performance. Chapter 6 summarizes and discusses the findings of the thesis. It also provides directions for future work that make use of the Targeted Verification concept.
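
The two quantities the abstract names, the false positive and false negative probabilities of a BMS, can be illustrated with a small simulation. The sketch below is not the Targeted Verification plan from the thesis; it is only the baseline the abstract contrasts against, in which every subject is verified with the gold standard. The function names, the prevalence, and the error rates are hypothetical values chosen for illustration.

```python
import random


def simulate_full_verification(n, prevalence, fpr, fnr, seed=0):
    """Simulate n subjects, each measured once by the BMS and once by a
    gold standard that is assumed to be error-free (illustrative only)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        truth = rng.random() < prevalence  # gold standard result
        if truth:
            bms = rng.random() >= fnr  # positive unless a false negative occurs
        else:
            bms = rng.random() < fpr   # positive only if a false positive occurs
        records.append((truth, bms))
    return records


def estimate_error_rates(records):
    """Estimate (false positive rate, false negative rate) from
    (gold_standard, bms) pairs when every subject has been verified."""
    n_neg = sum(1 for truth, _ in records if not truth)
    n_pos = sum(1 for truth, _ in records if truth)
    false_pos = sum(1 for truth, bms in records if not truth and bms)
    false_neg = sum(1 for truth, bms in records if truth and not bms)
    fpr_hat = false_pos / n_neg if n_neg else float("nan")
    fnr_hat = false_neg / n_pos if n_pos else float("nan")
    return fpr_hat, fnr_hat


if __name__ == "__main__":
    # Hypothetical settings: 30% prevalence, 5% false positives, 10% false negatives.
    data = simulate_full_verification(20000, 0.30, 0.05, 0.10)
    print(estimate_error_rates(data))
```

Every one of the simulated subjects requires a gold standard measurement here, which is the cost the plans described in the abstract aim to reduce.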