Published August 30, 2024 | Version v1
Conference paper Open

Fast and Accurate Regional Effect Plots for Automated Tabular Data Analysis

  • 1. Athena Research and Innovation Center In Information Communication & Knowledge Technologies
  • 2. Harokopio University of Athens
  • 3. Universität der Bundeswehr München

Description

The regional effect is a novel explainability method that can be used for automated tabular data understanding through a three-step procedure: a black-box machine learning (ML) model is trained on a tabular dataset, a regional effect method explains the ML model, and the explanations are used to understand the data and support decision making. Regional effect methods explain the effect of each feature of the dataset on the output within different subgroups, for example, how age (feature) affects annual income (output) for men and women separately (subgroups). Identifying meaningful subgroups is computationally intensive, and current regional effect methods face efficiency challenges. In this paper, we present regional RHALE (r-RHALE), a novel regional effect method designed for enhanced efficiency, making it particularly suitable for decision-making scenarios involving large datasets, i.e., with numerous instances or high dimensionality, and complex models such as deep neural networks. Beyond its efficiency, r-RHALE accurately handles tabular datasets with highly correlated features. We showcase the benefits of r-RHALE through a series of synthetic examples, benchmarking it against other regional effect methods. The accompanying code for the paper is publicly available.
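To make the idea of a regional effect concrete, the following is a minimal illustrative sketch and not the r-RHALE algorithm or the paper's accompanying code. It assumes a hypothetical stand-in `model` in place of a trained black-box model, a synthetic dataset with an "age" column and a binary subgroup column, and a subgroup split that is given in advance rather than discovered. The helper `regional_effect_slopes` is likewise a name introduced here for illustration; it averages finite-difference local effects of one feature separately within each subgroup.

```python
import numpy as np


def model(X):
    """Hypothetical stand-in for a trained black-box model (an assumption).

    Column 0 plays the role of age, column 1 is a binary subgroup indicator.
    The effect of age on the output differs between the two subgroups.
    """
    age, group = X[:, 0], X[:, 1]
    return np.where(group == 0, 2.0 * age, -1.0 * age)


def regional_effect_slopes(f, X, feature, split_feature, eps=1e-3):
    """Average local effect of `feature` on f within each subgroup.

    Local effects are central finite differences of f with respect to
    `feature`; subgroups are the distinct values of `split_feature`.
    """
    X_hi = X.copy()
    X_lo = X.copy()
    X_hi[:, feature] += eps
    X_lo[:, feature] -= eps
    local = (f(X_hi) - f(X_lo)) / (2.0 * eps)

    groups = X[:, split_feature]
    return {float(v): float(local[groups == v].mean()) for v in np.unique(groups)}


# Synthetic data: ages in [20, 65) and a random binary subgroup label.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(20.0, 65.0, size=500),
    rng.integers(0, 2, size=500).astype(float),
])

# A single global average mixes the two opposite trends ...
global_slope = float(sum(regional_effect_slopes(model, X, 0, 0).values()) * 0.0)
# ... whereas the per-subgroup averages recover each trend separately.
slopes = regional_effect_slopes(model, X, feature=0, split_feature=1)
print(slopes)
```

In this toy setting the per-subgroup averages separate the two opposite trends that a single global average would blur together. The computationally intensive part described in the abstract, identifying which subgroups are meaningful, is deliberately left out of the sketch.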

Files

paperVLDB_workshop.pdf

Size: 911.1 kB
md5:109dd55d43394067008ea4c95ab7476c

Additional details

Funding

European Commission
AI-DAPT – AI-Ops Framework for Automated, Intelligent and Reliable Data/AI Pipelines Lifecycle with Humans-in-the-Loop and Coupling of Hybrid Science-Guided and AI Models 101135826