Conference paper Open Access
Mara Graziani; Thomas Lompech; Henning Müller; Vincent Andrearczyk
Visualization methods for Convolutional Neural Net-works (CNNs) are spreading within the medical com-munity to obtain explainable AI (XAI). The sole quali-tative assessment of the explanations is subject to a riskof confirmation bias. This paper proposes a methodol-ogy for the quantitative evaluation of common visual-ization approaches for histopathology images, i.e. ClassActivation Mapping and Local-Interpretable Model-Agnostic Explanations. In our evaluation, we proposeto assess four main points, namely the alignment withclinical factors, the agreement between XAI methods,the consistency and repeatability of the explanations. Todo so, we compare the intersection over union of multi-ple visualizations of the CNN attention with the seman-tic annotation of functionally different nuclei types. Theexperimental results do not show stronger attributions tothe multiple nuclei types than those of a randomly ini-tialized CNN. The visualizations hardly agree on salientareas and LIME outputs have particularly unstable re-peatability and consistency. The qualitative evaluationalone is thus not sufficient to establish the appropriate-ness and reliability of the visualization tools. The codeis available on GitHub atbit.ly/2K48HKz.