Published January 2, 2023 | Version 0.0.2
Dataset Open

LJ Speech - Aligned IPA transcriptions

  • 1. Chemnitz University of Technology

Description

Files:

  • grids.zip

    • contains TextGrids for all audio files, each with three tiers: words, phonemes and transcription
      • words contains the aligned normalized English words
      • phonemes contains the aligned IPA pronunciations (without punctuation). The words were transcribed using the CMU dictionary and aligned with the Montreal Forced Aligner; the pronunciations were then mapped from ARPAbet to IPA and duration markers were applied
      • transcription contains unaligned phonemes including punctuation and word boundary labels (SIL0)
  • preview.png

    • preview of the first TextGrid opened in Praat
  • words-vocabulary.txt

    • contains all words from tier words
  • phonemes-vocabulary.txt

    • contains all phonemes from tier phonemes
  • transcription-vocabulary.txt

    • contains all phonemes/punctuation from tier transcription
  • phonemes-durations.pdf

    • contains the plotted phoneme duration distribution of tier phonemes
  • phonemes-durations-simple.pdf

    • contains the plotted phoneme duration distribution of tier phonemes if all duration markers are ignored
  • pronunciations.dict

    • contains the pronunciations for each word including punctuation and weights (occurrence)
  • script.sh

    • contains the script to reproduce all results
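As a rough illustration of how the three tiers could be read, the following sketch parses interval tiers from a TextGrid with regular expressions. It assumes the files use Praat's long text format and that the tiers of interest are interval tiers; neither is stated in the description above. The name read_tiers and the sample content are hypothetical and not taken from the dataset.

```python
import re

# One block per tier; "item []:" (the list header) has no digits and is skipped.
TIER_RE = re.compile(r'item \[\d+\]:(.*?)(?=item \[\d+\]:|\Z)', re.S)
NAME_RE = re.compile(r'name = "([^"]*)"')
# Praat escapes a double quote inside a label by doubling it.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.eE+-]+)\s*'
    r'xmax = ([\d.eE+-]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def read_tiers(textgrid_text):
    """Return {tier name: [(start, end, label), ...]} for interval tiers."""
    tiers = {}
    for match in TIER_RE.finditer(textgrid_text):
        block = match.group(1)
        name = NAME_RE.search(block)
        if name is None:
            continue
        tiers[name.group(1)] = [
            (float(xmin), float(xmax), text.replace('""', '"'))
            for xmin, xmax, text in INTERVAL_RE.findall(block)
        ]
    return tiers


# Made-up sample in Praat's long text format (not from grids.zip).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.0
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 1.0
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "hello"
        intervals [2]:
            xmin = 0.4
            xmax = 1.0
            text = ""
    item [2]:
        class = "IntervalTier"
        name = "phonemes"
        xmin = 0
        xmax = 1.0
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.1
            text = "h"
'''

tiers = read_tiers(SAMPLE)
```

For real use, a dedicated TextGrid library is likely more robust than this regex sketch, which ignores point tiers and the short text format.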

Phoneme duration marker:

  • ˘ -> [0, 20) percentile
  • ˑ -> [80, 90) percentile
  • ː -> [90, inf) percentile
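The marker table above can be sketched as a small lookup. The description does not say whether percentiles are computed per phoneme or over all phonemes, nor how the percentile rank is defined, so this sketch takes the reference durations as a parameter and assumes the rank is the share of reference values strictly below the given duration. The function names, and the choice to append the marker after the phoneme symbol, are assumptions for illustration and may differ from script.sh.

```python
from bisect import bisect_left

# Marker table from the description: (lower bound, upper bound, marker).
DURATION_MARKERS = [
    (0.0, 20.0, "˘"),
    (80.0, 90.0, "ˑ"),
    (90.0, float("inf"), "ː"),
]


def percentile_rank(value, sorted_reference):
    """Percentage of reference values strictly below `value` (assumed definition)."""
    if not sorted_reference:
        raise ValueError("reference durations must not be empty")
    return 100.0 * bisect_left(sorted_reference, value) / len(sorted_reference)


def duration_marker(duration, sorted_reference):
    """Return the duration marker for `duration`, or "" in the [20, 80) range."""
    rank = percentile_rank(duration, sorted_reference)
    for low, high, marker in DURATION_MARKERS:
        if low <= rank < high:
            return marker
    return ""


def mark_phoneme(phoneme, duration, sorted_reference):
    """Append the marker to the phoneme symbol (placement is an assumption)."""
    return phoneme + duration_marker(duration, sorted_reference)
```

Durations whose rank falls in [20, 80) receive no marker, which matches the gap in the table above.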

Silence marker:

  • SIL0 -> no silence
  • SIL1 -> [0, 33.33) percentile
  • SIL2 -> [33.33, 66.66) percentile
  • SIL3 -> [66.66, inf) percentile
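The silence markers follow the same pattern. The sketch below assumes that "no silence" means a pause of zero duration and that the percentile rank is the share of reference pause durations strictly below the given value; the reference set and the function name silence_marker are hypothetical, since the description does not specify them.

```python
from bisect import bisect_left


def silence_marker(duration, sorted_reference):
    """Map a pause duration to SIL0..SIL3 using the thresholds listed above."""
    if duration <= 0:
        return "SIL0"  # no silence between the two words
    if not sorted_reference:
        raise ValueError("reference durations must not be empty")
    # Share of reference pauses strictly shorter than this one (assumed definition).
    rank = 100.0 * bisect_left(sorted_reference, duration) / len(sorted_reference)
    if rank < 33.33:
        return "SIL1"
    if rank < 66.66:
        return "SIL2"
    return "SIL3"
```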

 

Notes

Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410

Files (37.8 MB)

  • md5:009e100ae3ad8523dc6b153a21bcba9a (33.9 MB)
  • md5:32cf66e25111cafd8c2abf10f2f15db2 (164.8 kB)
  • md5:1394471b64abb690a20da9edd57f95d1 (571.7 kB)
  • md5:2d38fa06d5c919e44e394a4896ef2bc7 (1.8 kB)
  • md5:43af2d4563c16e5afcd2cb6c54b237ca (170.7 kB)
  • md5:43b3fb61602f69e812055143d39fb41c (2.7 MB)
  • md5:db0435c1498850a89e117a662b75c477 (31.8 kB)
  • md5:729ba8f5ce49336c5d3281605a65f62e (1.8 kB)
  • md5:e8b3648b0f8afb2215460436e0108198 (217.2 kB)
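A downloaded file can be checked against its listed MD5 checksum with the Python standard library. The sketch below is one way to do this; the helper names are hypothetical, and since the listing here does not show which checksum belongs to which file, the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:009e...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```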
