Published April 23, 2020 | Version v1
Dataset Open

RSL2019: A Realistic Speech Localization Corpus

  • 1. Department of Electrical Engineering, Indian Institute of Technology Dharwad, Dharwad, India
  • 2. Department of Electrical and Computer Engineering, National University of Singapore, Singapore

Description

We present a new database for speech localization, which we refer to as the Realistic Speech Localization 2019 (RSL2019) corpus. The corpus is designed for the study of sound source localization in real-world applications. RSL2019 is a continuing effort and presently contains 22.60 hours of speech data, played over a loudspeaker from different directions of arrival (DOA) and recorded using a four-channel microphone array. We consider 180 speech utterances spoken by 6 speakers, selected from the RSR2015 database, which are played over the loudspeaker positioned at different angles and distances from the microphone array. We vary the DOA from 0 to 360 degrees at intervals of 5 degrees, at distances of 1 metre and 1.5 metres. At each position and DOA, we also record white noise to study robustness, and a time-stretched pulse to generate the transfer function for the speech localization algorithm. Furthermore, we present experimental results and analysis for a state-of-the-art sound source localization algorithm, using the open-source HARK toolkit, on the RSL2019 database. This database is provided for research purposes only.
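As a rough illustration of the recording grid described above, the following Python sketch enumerates the nominal loudspeaker positions. It assumes that 0 and 360 degrees denote the same direction, so 360 is excluded; the description does not state this explicitly, and the variable names are illustrative rather than part of the corpus.

```python
from itertools import product

STEP_DEG = 5               # DOA interval stated in the description
DISTANCES_M = (1.0, 1.5)   # loudspeaker distances stated in the description

# Assumption: 0 and 360 degrees coincide, so the range stops before 360.
doas_deg = list(range(0, 360, STEP_DEG))

# Every (distance, DOA) pair is one nominal loudspeaker position.
positions = list(product(DISTANCES_M, doas_deg))

print(len(doas_deg), "directions x", len(DISTANCES_M), "distances =",
      len(positions), "positions")
```

Under that assumption the grid has 72 directions at each of the 2 distances, or 144 positions in total. The actual layout of the released files may differ and should be checked against the archive itself.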

If you use this database, please cite the following paper.

Rohan Sheelvant, Bidisha Sharma, Maulik Madhavi, Rohan Kumar Das, S.R.M. Prasanna and Haizhou Li, "RSL2019: A Realistic Speech Localization Corpus," 2019 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2019, pp. 1-6, doi: 10.1109/O-COCOSDA46868.2019.9060842.

 


Files

Files (18.5 GB)

Size: 18.5 GB
md5:dd52ddd1fd51211cbe4919e9ecf7b401
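Given the size of the download, it may be worth checking the file against the MD5 checksum listed above before unpacking it. The following is a minimal Python sketch using only the standard library. The helper names `md5_of_file` and `matches_expected` are illustrative, and the archive file name is not given on this page, so any path passed in is the user's own.

```python
import hashlib

# Checksum listed on this page for the 18.5 GB download.
EXPECTED_MD5 = "dd52ddd1fd51211cbe4919e9ecf7b401"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected` with the local path of the downloaded file returns `True` only if the download is intact. Reading in chunks keeps memory use bounded regardless of file size.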