Published February 25, 2021 | Version v1.0
Dataset Open

First DIHARD Challenge -- System Submissions and Scores

  • 1. Linguistic Data Consortium, University of Pennsylvania
  • 2. Baidu Research
  • 3. Laboratoire de Sciences Cognitives et Psycholinguistique, ENS
  • 4. University of Science and Technology of China
  • 5. Electrical Engineering Department, Indian Institute of Science

Description

This dataset contains all submissions to the First DIHARD Speech Diarization Challenge (DIHARD I) as well as the output of the official scoring tool for these submissions. We are releasing it to the research community to support the development of new evaluation metrics for speech activity detection/diarization and system combination techniques. For more details about the evaluation, please consult the evaluation plan:

   https://zenodo.org/record/1199638

or the evaluation website:

   https://dihardchallenge.github.io/dihard1/index.html

**NOTE** that this release only includes the system RTTMs that teams submitted to the DIHARD I scoring server. It does **NOT** include the DIHARD I evaluation set reference RTTMs and UEMs, which are needed to score the system outputs. These reference RTTM/UEM files are distributed by the Linguistic Data Consortium (LDC) as part of LDC2019S12 and LDC2019S13.

Files

dihard1_all_submissions.zip (103.7 MB)
md5:dcd9e3bc2199aa1335301af327cb032f
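
The MD5 checksum listed for the archive can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_archive` are hypothetical and are not part of the dataset or of any DIHARD tooling.

```python
import hashlib

# Checksum as listed on this record for dihard1_all_submissions.zip.
EXPECTED_MD5 = "dcd9e3bc2199aa1335301af327cb032f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a ~100 MB archive is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches `expected` (hypothetical helper)."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved to the current directory, `verify_archive("dihard1_all_submissions.zip")` should return `True` for an intact download. A mismatch usually points to a truncated or corrupted transfer, in which case downloading the file again is the simplest remedy.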