There is a newer version of the record available.

Published June 19, 2024 | Version v1
Dataset Restricted

DISPLACE-2024 Dataset

Description

Inspired by the previous edition, the DISPLACE 2023 challenge, we have launched the DISPLACE 2024 challenge (https://displace2024.github.io/). Compared to the first DISPLACE challenge, the current challenge includes an additional track on automatic speech recognition (ASR) in code-switched, multi-accent conversational scenarios, along with the speaker and language diarization tracks. We release supervised data for exploring new directions on multilingual, multi-speaker, multi-accent conversational data. To the best of our knowledge, no publicly available dataset matches the diverse characteristics observed in the DISPLACE dataset, including code-mixing/switching, natural overlaps, reverberation, and noise. For this challenge, a natural multilingual, multi-speaker conversational dataset will be distributed for development and evaluation purposes. No training data will be provided, and participants are free to use any resource for training their models. The challenge reflects the theme of Interspeech 2024, "Speech and Beyond", in its true sense.



The dataset can be obtained from the organisers by submitting the request form together with the duly signed terms and conditions.

Link for registration to obtain the data: Registration Link

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Additional titles

Alternative title
The Second DISPLACE Challenge data

Software

Repository URL
https://github.com/displace2024/Displace2024_baseline
Programming language
Shell, Python