PB2007 French acoustic-articulatory speech database
Creators
- 1. Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab
Description
PB2007 acoustic-articulatory speech dataset
Badin P., Bailly G., Ben Youssef A., Elisei F., Savariaux C., Hueber T.
Univ. Grenoble Alpes, CNRS, Grenoble INP, GIPSA-lab, 38000 Grenoble, France
LICENSE:
========
This dataset is made available under the Creative Commons Attribution-ShareAlike (CC BY-SA) license.
CREDITS - ATTRIBUTION:
======================
If you use this dataset, please cite one of the following studies, all of which use it:
- Ben Youssef, A., Badin, P., Bailly, G. & Heracleous, P. (2009). Acoustic-to-articulatory inversion using speech recognition and trajectory formation based on phoneme hidden Markov models. In Interspeech 2009, pp. 2255-2258. Brighton, UK.
- Ben Youssef, A., Badin, P. & Bailly, G. (2010). Can tongue be recovered from face? The answer of data-driven statistical models. In Interspeech 2010 (11th Annual Conference of the International Speech Communication Association) (T. Kobayashi, K. Hirose & S. Nakamura, editors), pp. 2002-2005. Makuhari, Japan.
- Hueber, T., Bailly, G., Badin, P. & Elisei, F. (2013). Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions. In Interspeech 2013, pp. 2753-2757. Lyon, France.
DATA FILES DESCRIPTION:
=======================
/_seq/:
Electromagnetic articulography (EMA) data, recorded at 100 Hz
Sensors:
PAR01 : LT_x (lower incisor, x coordinate)
PAR02 : tip_x (tongue tip, x coordinate)
PAR03 : mid_x (tongue dorsum, x coordinate)
PAR04 : bck_x (tongue back, x coordinate)
PAR05 : LL_vis_x (lower lip, x coordinate)
PAR06 : UL_vis_x (upper lip, x coordinate)
PAR07 : LT_z (lower incisor, z coordinate)
PAR08 : tip_z (tongue tip, z coordinate)
PAR09 : mid_z (tongue dorsum, z coordinate)
PAR10 : bck_z (tongue back, z coordinate)
PAR11 : LL_vis_z (lower lip, z coordinate)
PAR12 : UL_vis_z (upper lip, z coordinate)
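
The channel layout above (six x coordinates followed by six z coordinates) can be regrouped per sensor. The following is a minimal sketch, assuming one EMA frame is already available as a sequence of 12 numbers in PAR01..PAR12 order; the on-disk format of the files in /_seq/ is not described here, so no file reader is shown, and the function names are hypothetical.

```python
# Sensor order as documented: PAR01..PAR06 hold x, PAR07..PAR12 hold z.
SENSORS = ["LT", "tip", "mid", "bck", "LL_vis", "UL_vis"]
EMA_RATE_HZ = 100  # sampling rate stated in the description


def channel_names():
    """Return the 12 channel names in PAR01..PAR12 order."""
    return [f"{s}_x" for s in SENSORS] + [f"{s}_z" for s in SENSORS]


def frame_to_points(frame):
    """Map one 12-value frame to {sensor: (x, z)}.

    Assumption: `frame` is ordered PAR01..PAR12 as listed above.
    """
    if len(frame) != 2 * len(SENSORS):
        raise ValueError("expected 12 values per EMA frame")
    n = len(SENSORS)
    return {s: (frame[i], frame[i + n]) for i, s in enumerate(SENSORS)}


def frame_time(index):
    """Time in seconds of the frame at `index`, counting from 0."""
    return index / EMA_RATE_HZ
```

For example, `frame_to_points(frame)["tip"]` pairs PAR02 with PAR08, the x and z coordinates of the tongue tip.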
/_wav16:
Subject's audio signal, synchronized with the EMA data
Format: PCM WAV, 16 kHz, 16 bits
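
Because the audio is sampled at 16 kHz and the EMA data at 100 Hz, each EMA frame spans 160 audio samples. The sketch below checks a file against the stated audio format and maps an EMA frame index to its audio sample range. It assumes both streams start at the same instant with no offset, which the description implies but does not state explicitly; the function names are hypothetical.

```python
import wave

AUDIO_RATE_HZ = 16000  # stated audio sampling rate
EMA_RATE_HZ = 100      # stated EMA sampling rate
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // EMA_RATE_HZ  # 160 audio samples per EMA frame


def check_wav_format(path):
    """Return True if the file is 16 kHz audio with 16-bit (2-byte) samples."""
    with wave.open(path, "rb") as w:
        return w.getframerate() == AUDIO_RATE_HZ and w.getsampwidth() == 2


def samples_for_ema_frame(frame_index):
    """Half-open audio sample range [start, stop) covered by one EMA frame.

    Assumption: audio sample 0 and EMA frame 0 are aligned.
    """
    start = frame_index * SAMPLES_PER_FRAME
    return start, start + SAMPLES_PER_FRAME
```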
/_lab:
Phonetic segmentation using the following symbol set:
__ (long pause), _ (short pause), a, e^ (as in "lait"), e (as in "blé"), i, y (as in "voiture"), u (as in "loup"), o^ (as in "pomme"), x (as in "pneu"), x^ (as in "coeur"), a~ (as in "flan"), e~ (as in "in"), x~ (as in "un"), o~ (as in "mon"), p, t, k, f, s, s^ (as in "CHat"), b, d, g, v, z, z^ (as in "les Gens"), m, n, r, l, w, h, j, o, q (schwa)
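
When parsing the segmentation, it can help to check that every label belongs to the documented inventory. This is a minimal sketch, assuming the labels have already been extracted as strings; the layout of the files in /_lab is not described here, so no parser is shown, and the names `PHONE_SET`, `PAUSES`, and `unknown_labels` are hypothetical.

```python
# Symbol inventory copied from the list above (36 symbols, including two pauses).
PHONE_SET = frozenset(
    "__ _ a e^ e i y u o^ x x^ a~ e~ x~ o~ "
    "p t k f s s^ b d g v z z^ m n r l w h j o q".split()
)
PAUSES = frozenset({"__", "_"})  # long and short pause


def unknown_labels(labels):
    """Return the labels that are not part of the documented symbol set."""
    return [label for label in labels if label not in PHONE_SET]
```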
Files
Name | Size | MD5 |
---|---|---|
PB2007.zip | 37.8 MB | af891c214c730071348bb0851be7ead1 |
(file name not shown) | 2.5 kB | 95382ee594f73382631e65d7a5b7801c |
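
The MD5 checksum listed for the 37.8 MB file can be used to verify a download. The following is a minimal sketch using Python's standard `hashlib`; it assumes that file is PB2007.zip, and the function names and the local path are hypothetical.

```python
import hashlib

# Checksum listed for the 37.8 MB file (assumed to be PB2007.zip).
EXPECTED_MD5 = "af891c214c730071348bb0851be7ead1"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

A call such as `verify("PB2007.zip")` returns `False` if the archive is incomplete or corrupted.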