Published May 23, 2026 | Version v1
Dataset Open

Multidimensional Gold-Standard Dataset for Explanation Needs in App Reviews

  • 1. Leibniz University Hannover

Description

Overview

This record provides a multidimensional gold-standard dataset for identifying and categorizing explanation needs in mobile app reviews. It contains app metadata, cleaned review candidate datasets, manual annotation exports, consolidated gold-standard labels, interrater-agreement files, and reusable training, validation, and test splits derived from the gold-standard annotations where applicable.

The dataset is intended to support research in requirements engineering and software engineering, in particular explanation-need detection, app review analysis, and app review classification using machine learning and large language models.

Dataset contents

The dataset is organized as follows:

  • data/01_app_metadata/: raw and cleaned app metadata collected from Apple App Store and Google Play Store.
  • data/02_cleaned_review_candidates/: cleaned and quality-controlled English review candidate datasets, including reviews with explanation-need indicators, mixed review candidates, and reviews without explanation-need indicators.
  • data/03_annotations/: manual annotation data, including individual rater annotations, interrater-agreement files, and consolidated gold-standard labels.
  • data/04_training_splits/: prepared training, validation, and test splits for explanation-need marking and taxonomy classification tasks, where applicable.
  • docs/: documentation files describing the taxonomy, label schema, annotation guidelines, preprocessing pipeline, and file organization.

The cleaned review candidate datasets in data/02_cleaned_review_candidates/ are intermediate candidate pools and should not be interpreted as final gold-standard labels. In particular, en_explanation_mixed contains candidate reviews used during the preparatory annotation and calibration phase, including reviews from the independent practice annotation round. These reviews were used to establish a shared understanding of the taxonomy and to identify early disagreements. They are not part of the final 5,004-review gold-standard dataset. The final gold-standard labels are provided under data/03_annotations/gold_labels/.
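After extracting the archive, it can help to confirm that the directories described above are present before running any analysis. The following is a minimal sketch: the directory names are taken from this description, while the helper name missing_dirs and the location of the extracted root folder are assumptions chosen for illustration.

```python
from pathlib import Path

# Directory names as listed in the dataset description. Where the archive
# is extracted to (the root) is an assumption left to the caller.
EXPECTED_DIRS = [
    "data/01_app_metadata",
    "data/02_cleaned_review_candidates",
    "data/03_annotations",
    "data/03_annotations/gold_labels",
    "data/04_training_splits",
    "docs",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling missing_dirs with the path of the extracted folder returns an empty list when the layout matches. If the archive unpacks into an additional top-level folder, pass that folder as the root instead.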

Annotation and taxonomy

The dataset distinguishes explicit explanation needs, implicit explanation needs, and reviews without explanation needs. For reviews containing explanation needs, the dataset further supports multidimensional categorization using the following taxonomy dimensions:

  • Time Aspect
  • Unexpected System Behavior - Bug
  • Software Feature
  • System Aspect

The human-readable taxonomy is provided in docs/taxonomy.md. A machine-readable label schema is provided in docs/label_schema.json. Annotation rules are documented in docs/annotation_guidelines.md.
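The machine-readable schema can be inspected with any JSON parser. The sketch below loads docs/label_schema.json and summarizes its top-level structure. Because the internal layout of the schema is not described here, the code deliberately assumes no key names; the helper names load_label_schema and describe, and the UTF-8 encoding, are assumptions.

```python
import json
from pathlib import Path


def load_label_schema(root):
    """Parse docs/label_schema.json under the extracted dataset root."""
    schema_path = Path(root) / "docs" / "label_schema.json"
    # UTF-8 is assumed; adjust if the file uses another encoding.
    with schema_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def describe(schema):
    """Summarize the top-level structure without assuming any key names."""
    if isinstance(schema, dict):
        return "object with keys: " + ", ".join(sorted(schema))
    if isinstance(schema, list):
        return f"list with {len(schema)} entries"
    return type(schema).__name__
```

Printing describe(load_label_schema(root)) is a quick way to see how the schema is organized before mapping its labels onto the taxonomy dimensions listed above.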

File formats

The dataset uses open and widely supported formats:

  • .csv for tabular data,
  • .parquet for efficient tabular storage,
  • .json and .jsonl for structured metadata and training examples,
  • .xlsx for annotation exports and agreement sheets.
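The text-based formats can be read with the Python standard library alone. The sketch below dispatches on the file extension for .csv, .json, and .jsonl files. The helper name load_records is hypothetical, and comma-separated CSV with a header row and UTF-8 encoding are assumptions not confirmed by this description.

```python
import csv
import json
from pathlib import Path


def load_records(path):
    """Load a .csv, .json, or .jsonl file into plain Python objects."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # Assumes a header row and comma delimiters.
        with path.open(newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh))
    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".jsonl":
        # One JSON document per non-empty line.
        with path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    raise ValueError(f"no standard-library reader for '{suffix}' files")
```

The .parquet and .xlsx files need third-party readers. One common option is pandas, whose read_parquet and read_excel functions rely on an installed engine such as pyarrow and openpyxl respectively.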

Suggested starting points

  • Use data/03_annotations/gold_labels/ to inspect the final gold-standard annotations.
  • Use data/03_annotations/rater_annotations/ to inspect individual rater annotations.
  • Use data/03_annotations/interrater_agreement/ to inspect interrater-agreement files.
  • Use data/04_training_splits/ to reuse the prepared training, validation, and test splits where applicable.
  • Use docs/taxonomy.md and docs/label_schema.json to understand the label structure.
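Since individual file names are not listed in this description, a practical first step is to take an inventory of what each directory contains. The sketch below counts files by extension under a given subdirectory. The helper name inventory is hypothetical; the subdirectory paths are the ones named above.

```python
from collections import Counter
from pathlib import Path


def inventory(root, subdir):
    """Count files by extension under root/subdir, searching recursively."""
    base = Path(root) / subdir
    return Counter(p.suffix.lower() for p in base.rglob("*") if p.is_file())
```

For example, inventory(root, "data/03_annotations/gold_labels") shows which of the supported formats the gold-standard labels are shipped in, which in turn indicates the reader to use.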

Version information

  • Dataset timestamp: 2024-12-20T19-55-42.075Z
  • App metadata timestamp: 2024-11-30T23-33-36.178Z
  • Release date: 2026-05-23
  • Version: 1.0
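The two timestamps above use hyphens in the time part, so strict ISO 8601 parsers may reject them. The sketch below parses them with an explicit format string. It assumes that the hyphens stand in for colons and that the trailing Z denotes UTC; neither is stated in this description, and the helper name parse_stamp is hypothetical.

```python
from datetime import datetime, timezone

# Assumed layout: date, "T", then hour-minute-second with hyphens,
# fractional seconds, and a literal "Z" interpreted as UTC.
STAMP_FORMAT = "%Y-%m-%dT%H-%M-%S.%fZ"


def parse_stamp(stamp):
    """Parse a dataset timestamp string into a timezone-aware datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


dataset_time = parse_stamp("2024-12-20T19-55-42.075Z")
metadata_time = parse_stamp("2024-11-30T23-33-36.178Z")
```

The resulting datetime objects can be compared or subtracted directly, for example to check that the app metadata snapshot precedes the dataset timestamp.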

Creator

Martin Obaidi

License

This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share and adapt the material, including for commercial purposes, provided that appropriate credit is given.

Citation

If you use this dataset, please cite it as:

Martin Obaidi. Multidimensional Gold-Standard Dataset for Explanation Needs in App Reviews. Zenodo. https://doi.org/10.5281/zenodo.20359756

Files

multidimensional-gold-standard-explanation-needs-app-reviews-v1.0.zip (1.9 GB)