Published October 30, 2018 | Version v1
Video/Audio Open

The Fharvard corpus

  • 1. University of Grenoble Alpes, CNRS, GIPSA-lab, Grenoble, France
  • 2. University of Konstanz, Konstanz, Germany

Description

The Fharvard corpus is a collection of 700 sentences in French, phonetically balanced into 70 lists of 10 sentences each. Each sentence contains 5 keywords for scoring.
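As an illustration of keyword scoring, the sketch below counts how many of a sentence's keywords appear in a listener's typed response. The function names, the exact-match criterion (case-insensitive and accent-insensitive) and the example sentence are assumptions made for illustration. They are not the scoring procedure used in the reference article.

```python
import re
import unicodedata


def normalise(word: str) -> str:
    """Lower-case a word and strip accents (an assumed, lenient criterion)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def score_response(keywords: list[str], response: str) -> int:
    """Count keywords found in the response; each response word is used once."""
    remaining = [normalise(w) for w in re.findall(r"[^\W\d_]+", response)]
    hits = 0
    for keyword in keywords:
        target = normalise(keyword)
        if target in remaining:
            remaining.remove(target)
            hits += 1
    return hits


# Hypothetical sentence and response, not taken from the corpus.
keywords = ["chat", "noir", "dort", "vieux", "canapé"]
print(score_response(keywords, "le chat dort sur le canape"))  # 3
```

With 5 keywords per sentence and 10 sentences per list, summing this count over one list would give a score out of 50.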

A detailed presentation and evaluation of this dataset can be found at: https://doi.org/10.1016/j.specom.2020.07.004

The sentences are listed in the file "The Fharvard corpus.pdf", with keywords in bold.

The phonetic transcription is provided in The Fharvard corpus - phonetic.txt. The ortho column contains the orthographic representation of the sentence with keywords in capital letters. The phono column contains the phonetic representation in SAMPA coding, with words separated by two successive space characters. Note that the phonetic representation is provided on an individual word basis, that is, discarding word-to-word liaisons. This is to provide an unambiguous basis for phonetic balancing at the keyword level, as the realisation of some liaisons can vary from talker to talker.
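The conventions described above (keywords in capital letters, SAMPA words separated by two spaces) could be handled along the following lines. This is a minimal sketch: the helper names are hypothetical, the example row is invented rather than taken from the corpus, and how the two columns are delimited in the file is not covered here.

```python
import re


def keywords_from_ortho(ortho: str) -> list[str]:
    """Return the words written entirely in capitals, in sentence order."""
    tokens = re.findall(r"[^\W\d_]+", ortho)
    # Skip single letters so a sentence-initial "A" or "L'" is not counted
    # (assumes no keyword is one letter long).
    return [t for t in tokens if len(t) > 1 and t.isupper()]


def words_from_phono(phono: str) -> list[str]:
    """Split a SAMPA string into words on two successive spaces."""
    return [w.strip() for w in phono.split("  ") if w.strip()]


# Hypothetical row, not taken from the corpus.
ortho = "le CHAT NOIR dort"
phono = "l@  Sa  nwaR  dOR"
print(keywords_from_ortho(ortho))  # ['CHAT', 'NOIR']
print(words_from_phono(phono))     # ['l@', 'Sa', 'nwaR', 'dOR']
```

Because the transcription is given word by word without liaisons, the n-th orthographic word should correspond to the n-th SAMPA word, which makes it possible to look up the transcription of each keyword by position.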

Audio recordings of the Fharvard sentences, spoken by one female and one male talker, are provided in the .zip archive files at sampling rates of 44.1 kHz and 16 kHz.

A sample sentence from each talker (female and male) is also attached.

Files

Audio 16kHz Female.zip

Files (403.4 MB)

Checksum (MD5)                        Size
d2cdabe535076515953cc96b2b80994e      53.8 MB
0a3cbe424f4c9bdbc0306712e5d2c939      56.6 MB
a0f0838538b9a3527f8d275492b14e19      141.0 MB
c534dc13acd06ec2d976e1b66cca7432      151.7 MB
4d8da4bba06063c6827400f144223e44      108.5 kB
7ae7b5d0f9e6890b9505ce79ed8ee96d      120.2 kB
7e164b4cb4b8391befab08781a039ad9      88.2 kB
190df22cd5a9962dc7379d10d6496c4d      90.1 kB
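The MD5 checksums listed above can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names are hypothetical, and the file path in the usage comment is a placeholder to be replaced with the actual downloaded file and its listed checksum.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Path, expected_md5: str) -> bool:
    """Compare a file's MD5 with a listed value (an 'md5:' prefix is accepted)."""
    expected = expected_md5.strip().lower().removeprefix("md5:")
    return md5_of_file(path) == expected


# Usage (placeholder path and checksum):
# matches_checksum(Path("downloaded_archive.zip"), "md5:<value from the list>")
```

Note that `str.removeprefix` requires Python 3.9 or later.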

Additional details

Related works

Is cited by
Journal article: 10.1016/j.specom.2020.07.004 (DOI)

Funding

European Commission: SPEECH UNIT(E)S – The multisensory-motor unity of speech (grant 339152)