Published May 30, 2023 | Version v2
Dataset Open

MeetingBank: A Benchmark Dataset for Meeting Summarization

Description

MeetingBank is a benchmark dataset created from the city councils of 6 major U.S. cities to supplement existing datasets. It contains 1,366 meetings with over 3,579 hours of video, as well as transcripts, PDF documents of meeting minutes, agendas, and other metadata. On average, a council meeting is 2.6 hours long and its transcript contains over 28k tokens, making it a valuable testbed for meeting summarizers and for extracting structure from meeting videos. The dataset contains 6,892 segment-level summarization instances for training and performance evaluation.

Files

MeetingBank.zip (637.1 MB)
md5:9c8303b01d25639388199ef625ccdb39
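
Because the record lists an MD5 checksum for the archive, a download can be checked against it before unpacking. The following is a minimal sketch, not an official tool for this dataset: the helper names `md5_of_file` and `verify_archive` are hypothetical, and the local path `MeetingBank.zip` assumes the file was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed for MeetingBank.zip on this record.
EXPECTED_MD5 = "9c8303b01d25639388199ef625ccdb39"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so the 637.1 MB archive is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("MeetingBank.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum mismatch")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.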

Additional details

Related works

Is published in
Conference paper: https://arxiv.org/abs/2305.17529