BSL-Hansard: A parallel, multimodal corpus of English and interpreted British Sign Language data from parliamentary proceedings
Description
BSL-Hansard is a novel open-source, multimodal resource that combines Sign Language video data in BSL with English text from the official transcription of British parliamentary sessions. This paper describes the method used to compile BSL-Hansard, including time alignment of the text using the MAUS (Schiel, 2015) segmentation system, gives some statistics about the dataset, and suggests experiments. These primarily include end-to-end Sign Language-to-text translation, but the dataset is also relevant to broader machine translation and to speech and language processing tasks.
This dataset will be useful for translation between BSL and English, or for studies in BSL or English down to the phonetic level.
Files (48.6 GB in total)
2020-06-videos.zip
MD5 checksum | Size |
---|---|
md5:b7c53f7d34a29d082b6bf22c7ffa3f22 | 5.4 GB |
md5:3c3e75ae6097f8f4333522a5103065e1 | 5.8 GB |
md5:2eb35d272c2f8a89d6eea8f772714011 | 5.4 GB |
md5:c89fe21da84ecb4b8f511a5db3347fe6 | 5.2 GB |
md5:43363812317887e6cc3a198a145f083c | 4.6 GB |
md5:0521d398c05ba8300dfbdfe040652132 | 5.3 GB |
md5:220460fd8064bc2e3aab31a00ec119f2 | 4.9 GB |
md5:1aff20ab50eb59ed718b90715cc22467 | 4.2 GB |
md5:fa77a23fae44ea60ece924eca7c18e3f | 2.8 GB |
md5:dfe22580ed1e2d1194b41a566ebd0dad | 4.9 GB |
md5:7987da2fde802614a04434dba6fc6ce2 | 2.9 MB |
md5:3e1864a6c3b118b1c46f64ffb3bc09cf | 2.6 MB |
md5:142e907b316cc043c5bfc35147665942 | 28.0 MB |
md5:00b071cdfb69b9c5b3a6ac9d5e3202e1 | 7.6 kB |
md5:7b73b83fac980800365793597395d175 | 2.0 MB |
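Because the video archives are several gigabytes each, it may be worth checking a download against its listed MD5 checksum before unpacking it. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_md5` are illustrative assumptions and are not part of the dataset or of any tooling distributed with it.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a local file with a listed value such as 'md5:b7c5...'.

    The optional 'md5:' prefix is stripped (hypothetical helper, assuming
    the checksum is copied as shown in the file list).
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

To use it, pass the path of a downloaded archive together with the checksum shown for that file; a result of `False` suggests an incomplete or corrupted download, or that the checksum belongs to a different file.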
Additional details
References
- McGill and Saggion (2023)