 

Quantifying the Performance of Explainability Algorithms


Date

2020-05-26

Authors

Lin, Zhong Qiu

Publisher

University of Waterloo

Abstract

Given their complexity, deep neural networks (DNNs) have long been criticized for the lack of interpretability in their decision-making processes. This 'black box' nature has hindered the adoption of DNNs in life-critical tasks. In recent years, there has been a surge of interest in artificial intelligence explainability/interpretability (XAI), where the goal is to produce an interpretation for a decision made by a DNN algorithm. While many explainability algorithms have been proposed for peeking into the decision-making process of DNNs, there has been limited exploration into assessing the performance of explainability methods, with most evaluations centred around subjective human visual perception of the produced interpretations. In this study, we explore a more objective strategy for quantifying the performance of explainability algorithms on DNNs. More specifically, we propose two quantitative performance metrics: i) Impact Score and ii) Impact Coverage. Impact Score assesses the percentage of critical factors with either strong confidence-reduction impact or decision-shifting impact. Impact Coverage assesses the percentage overlap between the identified critical factors and the adversarially impacted factors in the input. Furthermore, a comprehensive analysis using this approach was conducted on several explainability methods (LIME, SHAP, and Expected Gradients) across different task domains, such as visual perception, speech recognition, and natural language processing (NLP). The empirical evidence suggests that there is significant room for improvement for all evaluated explainability methods. At the same time, the evidence also suggests that even the latest explainability methods cannot produce consistently better results across different task domains and test scenarios.
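The two metrics described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the thesis's implementation: the masking strategy (zeroing out critical features), the confidence-drop threshold, and the helper names (`impact_score`, `impact_coverage`, `conf_drop`) are all assumptions introduced here for illustration.

```python
import numpy as np

def impact_score(model, inputs, critical_masks, conf_drop=0.5):
    """Hypothetical sketch of Impact Score: the fraction of inputs whose
    prediction flips, or whose confidence drops strongly, when the
    critical factors identified by an explanation are removed.
    The zero-masking and the conf_drop threshold are assumptions."""
    impacted = 0
    for x, mask in zip(inputs, critical_masks):
        p = model(x)                      # class probabilities for x
        y = np.argmax(p)                  # original decision
        x_masked = x * (1 - mask)         # remove the critical factors
        p_masked = model(x_masked)
        # Decision shift, or strong confidence reduction on the class.
        if np.argmax(p_masked) != y or p_masked[y] < conf_drop * p[y]:
            impacted += 1
    return impacted / len(inputs)

def impact_coverage(critical_mask, adversarial_mask):
    """Hypothetical sketch of Impact Coverage: overlap between the
    explanation's critical factors and the adversarially impacted
    region of the input (here, intersection over the adversarial region)."""
    intersection = np.logical_and(critical_mask, adversarial_mask).sum()
    return intersection / adversarial_mask.sum()

# Toy usage with a stand-in "model" mapping an input to two class scores.
def toy_model(x):
    m = x.mean()
    return np.array([m, 1.0 - m])

score = impact_score(toy_model, [np.ones(4)], [np.array([1.0, 1.0, 1.0, 0.0])])
coverage = impact_coverage(np.array([1, 1, 0, 0], dtype=bool),
                           np.array([0, 1, 1, 0], dtype=bool))
```

Under this sketch, an explanation scores well on Impact Score when removing the features it flags genuinely changes the model's decision, and well on Impact Coverage when the features it flags line up with a region known to drive the decision (such as an adversarial patch).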

Keywords

deep learning, xai, explainable ai, feature importance
