Published May 10, 2020 | Version v1
Thesis · Open Access

Towards a More Refined Training Process for Neural Networks: Applying Layer-wise Relevance Propagation to Understand and Improve Classification Performance on Imbalanced Datasets

  • Fraunhofer HHI
  • Technische Universität Berlin

Description

In this thesis, we consider the problem of neural network (NN) training on imbalanced datasets. The generalization performance of state-of-the-art deep neural networks depends heavily on the dataset composition, and class imbalance can lead to undesirable effects such as overfitting on majority classes or disregard of minority classes. Moreover, imbalanced datasets are not a rare case, especially when the data is taken from a real-world setting. To mitigate their negative effects on state-of-the-art networks, the training behavior of a neural network on imbalanced data is analyzed and adapted with the goal of achieving a more balanced performance, i.e., an improved generalization. For this purpose, heatmaps generated via Layer-wise Relevance Propagation (LRP) (Bach et al., 2015) are employed as a measure of (class-wise) understanding. As an example case, the widely used VGG-16 model (Simonyan and Zisserman, 2014) is chosen as the network architecture and the Adience benchmark (Eidinger, Enbar, and Hassner, 2014) as an imbalanced real-world dataset. We start with a brief overview of the methods relevant to this thesis, i.e., established approaches for dealing with imbalanced datasets when training neural networks, and explainability methods, especially LRP. In a first experiment, the use of LRP heatmaps as a measure of understanding is validated by determining their correlation to traditional evaluation metrics, e.g., test accuracy and F1-score. To this end, secondary metrics are derived from the LRP heatmaps to enable a comparison with the traditional evaluation metrics. A strong connection between the heatmap-based metrics and the traditional evaluation metrics is uncovered here. Moreover, we find that the heatmap-based metrics are indicative of how well the classes separate w.r.t. the prediction function f(x).
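The abstract does not spell out the propagation rule behind the heatmaps; as background, the generic LRP-ε rule from Bach et al. (2015) for a single dense layer can be sketched as follows. This is a minimal NumPy illustration of the rule, not the thesis's specific implementation (which uses VGG-16 and may apply different rules per layer type); the function name and signature are chosen here for illustration.

```python
import numpy as np

def lrp_epsilon_dense(a, w, b, relevance_out, eps=1e-6):
    """One LRP-epsilon backward step through a dense layer.

    a: (n_in,) activations entering the layer
    w: (n_in, n_out) weight matrix, b: (n_out,) biases
    relevance_out: (n_out,) relevance arriving from the layer above
    Returns (n_in,) relevance redistributed onto the layer's inputs.
    """
    z = a @ w + b                                  # forward pre-activations
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilize small denominators
    s = relevance_out / z                          # normalized relevance per output
    return a * (w @ s)                             # redistribute proportionally to each input's contribution
```

Applied layer by layer from the output back to the input, this yields a pixel-wise relevance map (heatmap) for the predicted class; for small ε and zero biases the total relevance is approximately conserved across layers.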
These properties motivate the second experiment, in which the LRP heatmaps and the secondary metrics derived from them are applied in practice to make class-wise adaptations to the NN's training behavior. Here, the expressivity of LRP heatmaps is demonstrated by achieving more balanced class-wise performance with LRP-based training methods, which, e.g., force the neural network during training to immediately consider previously neglected classes.
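The abstract does not give the exact adaptation scheme; one generic way such class-wise adaptations are often realized is by converting per-class scores (here, hypothetically, a heatmap-derived "understanding" metric per class) into per-class loss weights that emphasize poorly learned classes. The function names, the exponential weighting, and the temperature parameter below are illustrative assumptions, not the thesis's method.

```python
import numpy as np

def class_weights_from_scores(scores, temperature=1.0):
    """Turn per-class scores (higher = class already well learned)
    into loss weights that emphasize poorly learned classes.
    scores: (n_classes,) e.g. a heatmap-derived metric per class."""
    s = np.asarray(scores, dtype=float)
    inv = np.exp(-s / temperature)       # low score -> large weight
    return inv * len(s) / inv.sum()      # normalize so the mean weight is 1

def weighted_cross_entropy(probs, labels, weights):
    """Mean cross-entropy with each sample scaled by its true class's weight.
    probs: (n, n_classes) softmax outputs; labels: (n,) integer class ids."""
    p_true = probs[np.arange(len(labels)), labels]
    return -(weights[labels] * np.log(p_true + 1e-12)).mean()
```

Recomputing the weights periodically during training (e.g., once per epoch from fresh heatmap metrics) would let the loss shift emphasis toward classes the network currently neglects.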

Files

improving_with_lrp_msc_thesis.pdf (11.2 MB, md5:014cd15a3ca89ffc5d27ce6e171205b1)

Additional details

References

  • Abadi, M., A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. J. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Józefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. G. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. A. Tucker, V. Vanhoucke, V. Vasudevan, F. B. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng (2016). "TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems". In: CoRR abs/1603.04467. Software available from tensorflow.org. arXiv: 1603.04467.
  • Alber, M., S. Lapuschkin, P. Seegerer, M. Hägele, K. T. Schütt, G. Montavon, W. Samek, K.-R. Müller, S. Dähne, and P.-J. Kindermans (2019). "iNNvestigate Neural Networks!" In: Journal of Machine Learning Research 20, 93:1–93:8.
  • Arras, L., F. Horn, G. Montavon, K.-R. Müller, and W. Samek (2017). ""What is Relevant in a Text Document?": An Interpretable Machine Learning Approach". In: PLoS ONE 12.8, pp. 1–23.
  • Arulkumaran, K., A. Cully, and J. Togelius (2019). "AlphaStar: An Evolutionary Computation Perspective". In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO 2019, Prague, Czech Republic, July 13-17, 2019, pp. 314–315.
  • Bach, S., A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek (2015). "On Pixel-wise Explanations for Non-Linear Classifier Decisions by Layer-wise Relevance Propagation". In: PLoS ONE 10.7, pp. 1–46.
  • Bach, S., A. Binder, K.-R. Müller, and W. Samek (2016). "Controlling Explanatory Heatmap Resolution and Semantics via Decomposition Depth". In: 2016 IEEE International Conference on Image Processing, ICIP 2016, Phoenix, AZ, USA, September 25-28, 2016, pp. 2271–2275.
  • Baehrens, D., T. Schroeter, S. Harmeling, M. Kawanabe, K. Hansen, and K.-R. Müller (2010). "How to Explain Individual Classification Decisions". In: Journal of Machine Learning Research 11, pp. 1803–1831.
  • Bharadhwaj, H. (2018). "Layer-wise Relevance Propagation for Explainable Recommendations". In: CoRR abs/1807.06160. arXiv: 1807.06160.
  • Brahma, P. P., D. Wu, and Y. She (2016). "Why Deep Learning Works: A Manifold Disentanglement Perspective". In: IEEE Transactions on Neural Networks and Learning Systems 27.10, pp. 1997–2008.
  • Brinker, T. J., A. Hekler, J. S. Utikal, N. Grabe, D. Schadendorf, J. Klode, C. Berking, T. Steeb, A. H. Enk, and C. von Kalle (2018). "Skin Cancer Classification Using Convolutional Neural Networks: Systematic Review". In: Journal of Medical Internet Research 20.10, e11936.
  • Chawla, N. V., K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer (2002). "SMOTE: Synthetic Minority Over-Sampling Technique". In: Journal of Artificial Intelligence Research 16, pp. 321–357.
  • Chollet, F. et al. (2015). Keras. https://keras.io.
  • Chung, Y., H. Lin, and S. Yang (2016). "Cost-Aware Pre-Training for Multiclass Cost-Sensitive Deep Learning". In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, July 9-15, 2016, pp. 1411–1417.
  • Deng, J., W. Dong, R. Socher, L. Li, K. Li, and F. Li (2009). "ImageNet: A Large-Scale Hierarchical Image Database". In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), June 20-25, 2009, Miami, Florida, USA, pp. 248–255.
  • Deng, L., G. E. Hinton, and B. Kingsbury (2013). "New Types of Deep Neural Network Learning for Speech Recognition and Related Applications: An Overview". In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pp. 8599–8603.
  • Dosovitskiy, A. and T. Brox (2016). "Inverting Visual Representations with Convolutional Networks". In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pp. 4829–4837.
  • Drummond, C. and R. Holte (2003). "C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling Beats Over-Sampling". In: Workshop on Learning from Imbalanced Datasets II. Citeseer, pp. 1–8.
  • Eidinger, E., R. Enbar, and T. Hassner (2014). "Age and Gender Estimation of Unfiltered Faces". In: IEEE Transactions on Information Forensics and Security 9.12, pp. 2170–2179.
  • Fong, R. C. and A. Vedaldi (2017). "Interpretable Explanations of Black Boxes by Meaningful Perturbation". In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 3449–3457.
  • Funahashi, K. (1989). "On the Approximate Realization of Continuous Mappings by Neural Networks". In: Neural Networks 2.3, pp. 183–192.
  • Guo, X., Y. Yin, C. Dong, G. Yang, and G. Zhou (2008). "On the Class Imbalance Problem". In: 2008 Fourth International Conference on Natural Computation. Vol. 4, pp. 192–201.
  • He, H. and E. A. Garcia (2009). "Learning from Imbalanced Data". In: IEEE Transactions on Knowledge and Data Engineering 21.9, pp. 1263–1284.
  • Kalash, M., M. Rochan, N. Mohammed, N. D. B. Bruce, Y. Wang, and F. Iqbal (2018). "Malware Classification with Deep Convolutional Neural Networks". In: 9th IFIP International Conference on New Technologies, Mobility and Security, NTMS 2018, Paris, France, February 26-28, 2018, pp. 1–5.
  • Khan, S. H., M. Hayat, M. Bennamoun, F. A. Sohel, and R. Togneri (2018). "Cost-Sensitive Learning of Deep Feature Representations From Imbalanced Data". In: IEEE Transactions on Neural Networks and Learning Systems 29.8, pp. 3573–3587.
  • Kindermans, P.-J., K. T. Schütt, M. Alber, K.-R. Müller, D. Erhan, B. Kim, and S. Dähne (2018). "Learning How to Explain Neural Networks: PatternNet and PatternAttribution". In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018.
  • Kohlbrenner, M., A. Bauer, S. Nakajima, A. Binder, W. Samek, and S. Lapuschkin (2019). "Towards Best Practice in Explaining Neural Network Decisions with LRP". In: CoRR abs/1910.09840. arXiv: 1910.09840.
  • Kukar, M. and I. Kononenko (1998). "Cost-Sensitive Learning with Neural Networks". In: ECAI Proceedings, pp. 445–449.
  • Lapuschkin, S. (2019). "Opening the Machine Learning Black Box with Layer-wise Relevance Propagation". PhD thesis. Technische Universität Berlin.
  • Lapuschkin, S., A. Binder, K.-R. Müller, and W. Samek (2017). "Understanding and Comparing Deep Neural Networks for Age and Gender Classification". In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1629–1638.
  • LeCun, Y., L. Bottou, Y. Bengio, and P. Haffner (1998). "Gradient-Based Learning Applied to Document Recognition". In: Proceedings of the IEEE 86.11, pp. 2278–2324.
  • LeCun, Y., L. Bottou, G. B. Orr, and K. Müller (2012). "Efficient BackProp". In: Neural Networks: Tricks of the Trade - Second Edition. Ed. by G. Montavon, G. B. Orr, and K. Müller. Vol. 7700. Lecture Notes in Computer Science. Springer, pp. 9–48.
  • Levi, G. and T. Hassner (2015). "Age and Gender Classification Using Convolutional Neural Networks". In: IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2015, Boston, MA, USA, June 7-12, 2015, pp. 34–42.
  • Li, H., J. Li, P.-C. Chang, and J. Sun (2013). "Parametric Prediction on Default Risk of Chinese Listed Tourism Companies by Using Random Oversampling, Isomap, and Locally Linear Embeddings on Imbalanced Samples". In: International Journal of Hospitality Management 35, pp. 141–151.
  • Lin, H. W. and M. Tegmark (2016). "Why Does Deep and Cheap Learning Work so Well?" In: CoRR abs/1608.08225. arXiv: 1608.08225.
  • Lin, T., P. Goyal, R. B. Girshick, K. He, and P. Dollár (2020). "Focal Loss for Dense Object Detection". In: IEEE Transactions on Pattern Analysis and Machine Intelligence 42.2, pp. 318–327.
  • Liu, X., J. Wu, and Z. Zhou (2009). "Exploratory Undersampling for Class-Imbalance Learning". In: IEEE Transactions on Systems, Man, and Cybernetics: Systems, Part B 39.2, pp. 539–550.
  • Liu, Y., E. Racah, Prabhat, J. Correa, A. Khosrowshahi, D. Lavers, K. Kunkel, M. F. Wehner, and W. D. Collins (2016). "Application of Deep Convolutional Neural Networks for Detecting Extreme Weather in Climate Datasets". In: CoRR abs/1605.01156. arXiv: 1605.01156.
  • Montavon, G., A. Binder, S. Lapuschkin, W. Samek, and K.-R. Müller (2019). "Layer-wise Relevance Propagation: An Overview". In: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, pp. 193–209.
  • Montavon, G., S. Lapuschkin, A. Binder, W. Samek, and K.-R. Müller (2017). "Explaining Nonlinear Classification Decisions with Deep Taylor Decomposition". In: Pattern Recognition 65, pp. 211–222.
  • Muhammad, K., J. Ahmad, I. Mehmood, S. Rho, and S. W. Baik (2018). "Convolutional Neural Networks Based Fire Detection in Surveillance Videos". In: IEEE Access 6, pp. 18174–18183.
  • Ribeiro, M. T., S. Singh, and C. Guestrin (2016). ""Why Should I Trust You?": Explaining the Predictions of Any Classifier". In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, pp. 1135–1144.
  • Rothe, R., R. Timofte, and L. V. Gool (2015). "DEX: Deep EXpectation of Apparent Age from a Single Image". In: IEEE International Conference on Computer Vision Workshop, ICCV Workshops 2015, Santiago, Chile, December 7-13, 2015, pp. 252–257.
  • Rothe, R., R. Timofte, and L. V. Gool (2018). "Deep Expectation of Real and Apparent Age from a Single Image without Facial Landmarks". In: International Journal of Computer Vision 126.2-4, pp. 144–157.
  • Ruseti, S., M. Dascalu, A. M. Johnson, D. S. McNamara, R. Balyan, K. S. McCarthy, and S. Trausan-Matu (2018). "Scoring Summaries Using Recurrent Neural Networks". In: Intelligent Tutoring Systems - 14th International Conference, ITS 2018, Montreal, QC, Canada, June 11-15, 2018, Proceedings, pp. 191–201.
  • Samek, W., A. Binder, G. Montavon, S. Lapuschkin, and K.-R. Müller (2017). "Evaluating the Visualization of What a Deep Neural Network Has Learned". In: IEEE Transactions on Neural Networks and Learning Systems 28.11, pp. 2660–2673.
  • Segler, M. H., M. Preuss, and M. P. Waller (2018). "Planning Chemical Syntheses with Deep Neural Networks and Symbolic AI". In: Nature 555.7698, p. 604.
  • Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra (2017). "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization". In: IEEE International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017, pp. 618–626.
  • Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra (2020). "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization". In: International Journal of Computer Vision 128.2, pp. 336–359.
  • Simonyan, K., A. Vedaldi, and A. Zisserman (2013). "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps". In: CoRR abs/1312.6034. arXiv: 1312.6034.
  • Simonyan, K. and A. Zisserman (2014). "Very Deep Convolutional Networks for Large-Scale Image Recognition". In: CoRR abs/1409.1556. arXiv: 1409.1556.
  • Sixt, L., M. Granz, and T. Landgraf (2019). "When Explanations Lie: Why Modified BP Attribution Fails". In: CoRR abs/1912.09818. arXiv: 1912.09818.
  • Springenberg, J. T., A. Dosovitskiy, T. Brox, and M. A. Riedmiller (2015). "Striving for Simplicity: The All Convolutional Net". In: 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Workshop Track Proceedings.
  • Wang, S., W. Liu, J. Wu, L. Cao, Q. Meng, and P. J. Kennedy (2016). "Training Deep Neural Networks on Imbalanced Data Sets". In: International Joint Conference on Neural Networks, IJCNN 2016, Vancouver, BC, Canada, July 24-29, 2016, pp. 4368–4374.
  • Yosinski, J., J. Clune, A. M. Nguyen, T. J. Fuchs, and H. Lipson (2015). "Understanding Neural Networks Through Deep Visualization". In: CoRR abs/1506.06579. arXiv: 1506.06579.
  • Zeiler, M. D. and R. Fergus (2014). "Visualizing and Understanding Convolutional Networks". In: Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I, pp. 818–833.
  • Zhou, B., A. Khosla, À. Lapedriza, A. Oliva, and A. Torralba (2016). "Learning Deep Features for Discriminative Localization". In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pp. 2921–2929.
  • Zintgraf, L. M., T. S. Cohen, T. Adel, and M. Welling (2017). "Visualizing Deep Neural Network Decisions: Prediction Difference Analysis". In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings.