Published November 1, 2020 | Version v1
Video/Audio Open

ASPIRE - Real noisy audio-visual speech enhancement corpus

  • 1. Edinburgh Napier University
  • 2. University of Wolverhampton

Description

What is ASPIRE?

ASPIRE is a first-of-its-kind audio-visual speech corpus recorded in real noisy environments (such as cafes and restaurants). The corpus follows the same sentence format as the audio-visual Grid corpus and can be used to support reliable evaluation of next-generation multi-modal speech filtering technologies.
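The Grid corpus sentence format mentioned above combines one word from each of six fixed word classes (command, color, preposition, letter, digit, adverb). As a rough sketch of that structure, the snippet below generates Grid-style sentences; the exact word lists are quoted from descriptions of the Grid corpus and should be checked against the corpus documentation before use:

```python
import random

# Grid-style sentence template: <command> <color> <preposition> <letter> <digit> <adverb>
# (word lists assumed from the Grid corpus description, not from this record)
COMMANDS = ["bin", "lay", "place", "set"]
COLORS = ["blue", "green", "red", "white"]
PREPOSITIONS = ["at", "by", "in", "with"]
LETTERS = list("abcdefghijklmnopqrstuvxyz")  # 'w' is excluded in Grid
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]
ADVERBS = ["again", "now", "please", "soon"]

def grid_sentence(rng=random):
    """Draw one word from each class to form a Grid-style sentence."""
    classes = (COMMANDS, COLORS, PREPOSITIONS, LETTERS, DIGITS, ADVERBS)
    return " ".join(rng.choice(words) for words in classes)
```

For example, `grid_sentence()` might return a string such as "place blue at f two now".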

Citing the corpus

@article{gogate2020cochleanet,
  title={CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement},
  author={Gogate, Mandar and Dashtipour, Kia and Adeel, Ahsan and Hussain, Amir},
  journal={Information Fusion},
  volume={63},
  pages={273--285},
  year={2020},
  publisher={Elsevier}
}

OR

Gogate, M., Dashtipour, K., Adeel, A., & Hussain, A. (2020). CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement. Information Fusion, 63, 273-285.

Acknowledgements

This research was funded by the UK Engineering and Physical Sciences Research Council (EPSRC project AV-COGHEAR, EP/M026981/1).

Files

ASPIRE.zip (13.0 GB)
md5:0b597f4ed2774c8af6c0332ec16cf1fa
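After downloading, the archive can be checked against the md5 checksum listed above. A minimal sketch using Python's standard library (the filename `ASPIRE.zip` is taken from the file listing):

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Expected checksum from the record's file listing
EXPECTED = "0b597f4ed2774c8af6c0332ec16cf1fa"

# Usage (assuming ASPIRE.zip is in the current directory):
#   assert md5_of_file("ASPIRE.zip") == EXPECTED, "checksum mismatch"
```

Chunked reading keeps memory use constant, which matters for a 13.0 GB archive.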

Additional details

Related works

Is supplemented by
Journal article: 10.1016/j.inffus.2020.04.001 (DOI)

Funding

Towards visually-driven speech enhancement for cognitively-inspired multi-modal hearing-aid devices (AV-COGHEAR) EP/M026981/1
UK Research and Innovation

References

  • Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain, CochleaNet: A robust language-independent audio-visual model for real-time speech enhancement, Information Fusion, Volume 63, 2020, Pages 273-285, ISSN 1566-2535, https://doi.org/10.1016/j.inffus.2020.04.001.