Financial Fraud Detection and Data Mining of Imbalanced Databases using State Space Machine Learning

Sawh, Deitra

Files

Sawh_Deitra.pdf (2.01 MB)

Date

2016-01-04

Authors

Sawh, Deitra

Advisor

Ponnambalam, Kumaraswamy

Publisher

University of Waterloo

Abstract

Risky decisions made by humans share common characteristics. The related systems experience repeated abuse by risky humans, and their actions collude to form a systemic behavioural set. Financial fraud is an example of such risky behaviour. Fraud detection models have drawn attention since the financial crisis of 2008 because of the frequency and size of frauds and the technological advances leading to financial market manipulation. Statistical methods dominate industrial fraud detection systems at banks, insurance companies and financial marketplaces. In both the academic literature and industrial settings, most efforts thus far have focused on anomaly detection problems and simple rules. There are unsolved issues in modeling the behaviour of risky agents in real-world financial markets using machine learning. This research studies the challenges posed by fraud detection, including the problem of imbalanced class distributions, and investigates the use of Reinforcement Learning (RL) to model risky human behaviour. Models have been developed to transform the relevant financial data into a state-space system. Reinforcement Learning agents uncover the decision-making processes of risky humans and derive an optimal path of behaviour at the end of the learning process. States are weighted by risk and then classified as positive (risky) or negative (not risky). The positive samples are composed of features that represent the hidden information underlying the risky behaviour. Reinforcement Learning is implemented in both unsupervised and supervised models. The unsupervised learning agent searches for risky behaviour without any previous knowledge of the data; it is not “trained” on data with true class labels. Instead, the RL learner relates samples through experience. The supervised learner is trained on a proportion (e.g. 90%) of the data with class labels. During the training stage, it derives a policy of optimal actions to be taken at each state.
One policy is selected from several learning agents, and the model is then exposed to the remaining proportion (e.g. 10%) of the data for classification. RL is hybridized with a Hidden Markov Model (HMM) in the supervised learning model to impose a probabilistic framework around the risky agent’s behaviour. We first study an insider trading example to demonstrate how learning algorithms can mimic risky agents. The classification power of the model is further demonstrated by applying it to a database, based on real-world data, of debit card transaction fraud. We then apply the models to two problems found in Statistics Canada databases: heart disease detection and female labour force participation. All models are evaluated using measures appropriate for imbalanced class problems: “sensitivity” and “false positive rate”. Sensitivity measures the number of correctly classified positive samples (e.g. fraud) as a proportion of all positive samples in the data. The false positive rate measures the number of negative samples classified as positive as a proportion of all negative samples in the data. The intent is to maximize sensitivity and minimize the false positive rate. All models show high sensitivity rates while exhibiting low false positive rates. These two metrics are well suited to industrial implementation because together they indicate high levels of identification at a low cost. Fraud detection rate is the focus, with detection rates of 75-85% proving that RL is a superior method for data mining of imbalanced databases. By solving the problem of hidden information, this research can facilitate the detection of risky human behaviour and prevent it from happening.
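
The two evaluation measures defined in the abstract can be illustrated with a minimal Python sketch. The function name, the label encoding (1 = positive/risky, 0 = negative) and the sample data below are assumptions made for illustration only; they are not taken from the thesis or its results.

```python
def sensitivity_and_fpr(y_true, y_pred):
    """Return (sensitivity, false positive rate) for binary labels.

    Assumed encoding (hypothetical): 1 = positive (risky), 0 = negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # positives correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # positives missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # negatives wrongly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # negatives correctly passed
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)          # share of all positives that were detected
    false_positive_rate = fp / (fp + tn)  # share of all negatives that were flagged
    return sensitivity, false_positive_rate


# Made-up imbalanced sample: 4 positive and 16 negative cases.
y_true = [1] * 4 + [0] * 16
y_pred = [1, 1, 1, 0] + [1] + [0] * 15

sens, fpr = sensitivity_and_fpr(y_true, y_pred)
print(f"sensitivity = {sens:.4f}, false positive rate = {fpr:.4f}")
```

Reporting the two rates separately matters for imbalanced data: a classifier that labels every sample negative would score high accuracy on this sample while having zero sensitivity.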