There is a newer version of the record available.

Published June 19, 2024 | Version v4
Dataset Open

WONDERBREAD: A Benchmark + Dataset for Business Process Management (BPM) Tasks

  • 1. Stanford University

Description

Paper: WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks

Background

The WONDERBREAD dataset contains 2,928 human demonstrations of 598 web navigation workflows across 6 types of BPM tasks. These tasks measure a model's ability to generate accurate documentation, assist in knowledge transfer, and improve the efficiency of workflows.

Please see our website for more details: https://wonderbread.stanford.edu/

Quick Start

To start, download debug_demos.zip (~1 GB). It contains a subset of 24 demonstrations which can give you a sense of how the dataset is structured.

To reproduce the paper, download gold_demos.zip (~33 GB). It contains 724 demonstrations corresponding to the 162 "Gold" tasks which were used for all the evaluations in the original paper.

To obtain the full dataset, download demos.zip (~133 GB). This contains all 2,928 demonstrations and can be used for training, fine-tuning, and evaluating models.
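
After downloading an archive, it can help to look inside it before unpacking tens of gigabytes. The following is a minimal sketch, assuming the archives are standard zip files; the function name `list_top_level_entries` is hypothetical, and the internal folder layout is not described in this record, so the sketch only reports whatever top-level names it finds.

```python
import zipfile
from pathlib import PurePosixPath


def list_top_level_entries(zip_path):
    """Return the sorted, de-duplicated top-level names inside a zip archive.

    Only the archive index is read, so nothing is extracted to disk.
    """
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Zip member names always use forward slashes.
            parts = PurePosixPath(member).parts
            if parts:
                names.add(parts[0])
    return sorted(names)
```

For example, `list_top_level_entries("debug_demos.zip")` would list the top-level entries of the debug subset, assuming the file was saved in the current directory.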

Dataset Structure

The dataset contains several files, defined below.

  1. Raw Data (useful for training/fine-tuning/evaluation)
    1. debug_demos.zip -- a subset of only 24 demonstrations taken from the full dataset. Useful to get a sense of the dataset and for debugging.
    2. gold_demos.zip -- a subset of only 724 demonstrations corresponding to the 162 "Gold" tasks. This is the dataset that was used for all evaluations in the original WONDERBREAD paper.
    3. demos.zip -- all 2,928 demonstrations across 598 tasks. Useful for training your own models.
  2. Evaluation (useful for evaluation)
    1. qa_dataset.csv -- contains all 120 questions and ground truth answers used in the "Knowledge Transfer" evaluation.
    2. df_rankings.csv -- contains the rankings of all "Gold" tasks used in the "SOP Ranking" evaluation.
  3. Metadata (can be safely ignored)
    1. Process Mining Task Demonstrations.xlsx -- maps human annotators to specific demonstrations; also contains "Gold" task rankings used in the "SOP Ranking" evaluation.
    2. metadata.json -- maps Google Drive URLs to Google Drive Folder IDs to demonstration names
    3. df_valid.csv -- tracks assets associated with each demonstration
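
The column names of the CSV files listed above are not documented in this record, so a first step is to inspect them rather than assume a schema. The following is a minimal sketch using only the Python standard library; the function name `preview_csv` is hypothetical, and UTF-8 encoding is an assumption.

```python
import csv
from itertools import islice


def preview_csv(csv_path, max_rows=3):
    """Return (column_names, first_rows) without assuming any schema.

    Each row is a dict keyed by the column names found in the header line.
    """
    # UTF-8 is an assumption; adjust if the file uses another encoding.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, max_rows))
        columns = list(reader.fieldnames or [])
    return columns, rows
```

Calling `preview_csv("qa_dataset.csv")` or `preview_csv("df_rankings.csv")` would return the header and the first few records, which is usually enough to decide how to load the file for an evaluation.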

Files

Files (166.6 GB)

MD5 checksum                            Size
md5:3798bda329c62001fa29121d0f41a7ac    943.0 MB
md5:38b1ebccf78c0ffe99a315e1b46fb769    132.7 GB
md5:f8c3dea8dba4a1be882162c7b8e21f98    109.6 kB
md5:1eccae55662f0955c246e4175d76da16    324.6 kB
md5:521fe91eb623e7cc681de762458219a2    33.0 GB
md5:1bb51a21dd0ab98a1f29b46d5ac11509    793.9 kB
md5:6aea20f602ce1bbbc1a5773fb0204e71    586.2 kB
md5:5fe03543494a2d8638a7a91d573202d4    42.4 kB
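
The MD5 checksums listed above can be used to confirm that a large download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_checksum` are hypothetical. The file is read in chunks so that an archive of over 100 GB does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

Pass each downloaded file together with the checksum shown for it on the record page, for example `matches_checksum(path_to_archive, "md5:...")`. A result of `False` suggests the download should be repeated.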

Additional details

Related works

Is published in
Publication: arXiv:2406.13264 (arXiv)

Dates

Updated
2024-07-06

Software

Repository URL
https://github.com/HazyResearch/wonderbread
Development Status
Active