There is a newer version of the record available.

Published September 17, 2022 | Version v1.0
Video/Audio Open

Speech corpus of Armenian question-answer dialogues

  • 1. INALCO
  • 2. Stony Brook University

Description

This is a corpus of elicited controlled speech. The stimuli were a sequence of dialogues with intermittent fillers. This repository covers only the stimuli. The stimuli were designed to elicit intonation patterns for questions and answers in two Armenian dialects: Western Armenian (WA) and Eastern Armenian (EA). The recordings can be used for topics like intonation prosody or ASR (Automatic Speech Recognition).

The dataset is open-access and contains 8,852 dialogues, consisting of 23,711 utterances (individual sound files), for a total of 2.7 GB and 8.5 hours. Each utterance has a sound file, a Praat TextGrid (with full linguistic annotation), and a text file with orthographic forms for easier ASR use. Pronunciation dictionaries are also provided for ASR purposes.
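Since each utterance comes as three companion files, one way to work with the corpus is to group files by their shared base name. The following is a minimal sketch, not the authors' tooling. It assumes that the three files of an utterance share a file stem and use the extensions `.wav`, `.TextGrid`, and `.txt`; the record does not state the actual extensions or folder layout, and the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Assumed extensions (not stated in the record); adjust after inspecting the archive.
ROLES = {".wav": "audio", ".textgrid": "textgrid", ".txt": "text"}


def group_utterances(root):
    """Map each file stem to the audio/TextGrid/text paths found under root."""
    groups = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        role = ROLES.get(path.suffix.lower())
        if path.is_file() and role:
            groups[path.stem][role] = path
    return dict(groups)


def complete_utterances(groups):
    """Keep only the stems that have all three companion files."""
    return {stem: files for stem, files in groups.items()
            if len(files) == len(ROLES)}
```

Calling `complete_utterances(group_utterances("path/to/unzipped/corpus"))` would return only the utterances with all three files present. If the same stem occurs in more than one folder, this sketch would merge them, so the grouping key may need to include the parent directory.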

Files

jhdeov/armenian-intonation-v1.0.zip

Files (2.2 GB)

Name Size
jhdeov/armenian-intonation-v1.0.zip (md5:705d0533b5017b68aa3a5507c05c7552) 2.2 GB
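The md5 value in the file listing can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the function names and the local file path in the usage note are hypothetical, and only the expected checksum comes from this record.

```python
import hashlib

# Checksum as published in the file listing of this record.
EXPECTED_MD5 = "705d0533b5017b68aa3a5507c05c7552"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest equals the expected value."""
    return md5_of_file(path) == expected
```

For example, `matches_record("armenian-intonation-v1.0.zip")` would return `True` only for a complete, unmodified copy of the archive, assuming it was saved under that name in the working directory. Reading in chunks keeps memory use low for a 2.2 GB file.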
