Dataset: Gold standard dataset for explainability need detection in app reviews.

Obaidi, Martin

doi:10.5281/zenodo.13273192

Published September 13, 2024 | Version v2

Dataset Open

Obaidi, Martin (Contact person)¹

1. Leibniz Universität Hannover

We crawled 90,000 app reviews from the Google Play Store and the Apple App Store, covering both free and paid apps. After these reviews were filtered for explainability needs, 4,495 reviews remained: 2,185 indicate an explanation need and 2,310 do not. The resulting gold standard dataset was used to train and evaluate several machine learning models and rule-based approaches for detecting explanation needs in app reviews.
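The counts above imply a nearly balanced label distribution. The short sketch below only redoes that arithmetic from the figures stated in the description. The helper name `share` is illustrative and not part of the published archive.

```python
# Counts taken from the dataset description; nothing here reads the archive.
CRAWLED = 90_000   # reviews crawled from both stores
NEED = 2_185       # reviews labeled as indicating an explanation need
NO_NEED = 2_310    # reviews labeled as not indicating one

labeled = NEED + NO_NEED


def share(part: int, whole: int) -> float:
    """Return part / whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


print(f"labeled reviews: {labeled}")
print(f"kept after filtering: {share(labeled, CRAWLED)}% of crawled reviews")
print(f"explanation need: {share(NEED, labeled)}%")
print(f"no explanation need: {share(NO_NEED, labeled)}%")
```

With these figures, roughly 5% of the crawled reviews remain in the gold standard, split at about 48.6% to 51.4% between the two classes.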

The dataset includes both balanced and unbalanced evaluation sets, as well as the original crawled data from October 2023. In addition to machine learning approaches, rule-based methods optimized for F1 score, precision, and recall are also included.
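For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how precision, recall, and F1 are computed for the positive class ("explanation need"). It uses the standard definitions only. The function name `precision_recall_f1` and the toy labels are assumptions made for illustration; they are not taken from the archive's evaluation notebooks.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy labels, invented for this example: 1 = explanation need, 0 = none.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

A method tuned for precision accepts more missed explanation needs in exchange for fewer false alarms, while one tuned for recall makes the opposite trade. F1 is the harmonic mean of the two.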

We provide several pre-trained machine learning models (including BERT, SetFit, AdaBoost, K-Nearest Neighbor, Logistic Regression, Naive Bayes, Random Forest, and SVM) along with training scripts and evaluation notebooks. These models can be applied directly or retrained using the included datasets.

For further details on the structure and usage of the dataset, please refer to the README.md file within the provided ZIP archive.

Files

automated_explain_detection-v02.zip

Files (871.7 MB)

Name	Size
automated_explain_detection-v02.zip (md5:df7ef758a051561e1ee7b9d1878b8856)	871.7 MB
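Because the archive is large, it can be worth checking the download against the MD5 checksum listed above before unpacking it. The sketch below is one way to do that with Python's standard library. It assumes the file was saved under its original name in the current directory, and the helper names `md5_of_file` and `verify_archive` are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the file table of this record.
EXPECTED_MD5 = "df7ef758a051561e1ee7b9d1878b8856"


def md5_of_file(path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the download was saved.
    archive = Path("automated_explain_detection-v02.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest fix.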

	All versions	This version
Views	174	134
Downloads	25	19
Data volume	26.2 GB	19.2 GB
