Discrimination is an ongoing problem in the insurance industry. It persists, regardless of intent, even when the insurer blinds the pricing process to socially controversial or legally prohibited inputs. In this thesis, we contextualize the problem in property and casualty insurance, considering the prevailing legislation in the United States and the European Union.
In Chapter 1, we introduce the problem of discrimination in insurance and present contemporary legal cases in the United States, along with recent pricing evidence that supports the hypothesis of discrimination in insurance pricing. We contrast the strengths and weaknesses of some anti-discrimination methodologies for a continuous response variable, from both theoretical and practical viewpoints. This introduction leads to four research questions, which we address throughout this thesis.
To ensure that the numerical results of our study are realistic, in Chapter 2 we analyze the largest publicly available database of police-reported motor vehicle traffic accidents in the United States. We describe a methodology for extracting a representative sample for the period 2001-2020, and present some results from an analysis of the data. We analyze a nationally representative sample of 1,583,520 people involved in 20 years of fatal and non-fatal accidents to examine the injury severity of motor vehicle occupants. In particular, we examine the impact of traditional personal automobile insurance rating factors, such as gender, age, and previous traffic infractions, on serious and fatal injuries. An estimated cost of the accidents is used to highlight the rating factors that have the greatest influence on prediction accuracy. These results aid in the calibration of a microsimulation model, presented in Chapter 4.
In Chapter 3, we examine the discrimination-free premium in Lindholm et al. (2022a) within a theoretical causal inference framework, and we consider its societal context to assess when the pricing formula should be used. We approach the insurance pricing problem through directed acyclic graphs, a tool that allows us to rigorously define an insurance risk factor in a causal framework. We then use this definition to assess the appropriate application of the discrimination-free premium through three simplified pricing examples: a health insurance policy and two personal automobile insurance policies with different coverages. From our findings, we suggest criteria for the application of the discrimination-free premium that depend on the risk factors and the social context.
In Chapter 4, we describe a microsimulation model that can generate a simulated population of the United States. It is designed to match selected characteristics of the target population in aggregate. We focus on a 2020 pseudo-population from Wisconsin, which we use to explore personal automobile insurance premium ratings. We contrast four pricing models in terms of prediction accuracy and in terms of their discriminatory impact with respect to race, using four different definitions of discrimination proposed in the actuarial and machine learning literature. By adapting definitions of disparate impact and proxy discrimination to a statistical test, we show that the traditional assumption of independence between frequency and severity can not only result in reduced prediction performance, but can also be detrimental to racial minorities.
In Chapter 5, we conclude and present some directions for future research.