There is a newer version of this record available.

Dataset Open Access

Jingju a cappella singing dataset part2

Rong Gong; Rafael Caro Repetto; Xavier Serra


This is a jingju (also known as Beijing or Peking opera) a cappella singing audio dataset consisting of 120 arias, which account for 1265 melodic lines. It extends our existing CompMusic jingju corpora and datasets, for example, Jingju a cappella singing dataset part1. Both professional and amateur singers were invited to the recording sessions, and the most common jingju musical elements are covered. The dataset is accompanied by metadata per aria and per melodic line, annotated for automatic singing evaluation research.


文件 Files:

  1. audio files in .wav format, mono
  2. aria and line level metadata
  3. line and syllable time boundary and label annotations, in Praat .textgrid format
  4. line and syllable time boundary and label annotations, in .txt format
    1. *phrase_char: phrase-level time boundaries, labeled in Mandarin characters
    2. *phrase: phrase-level time boundaries, labeled in Mandarin pinyin
    3. *syllable: syllable-level time boundaries, labeled in Mandarin pinyin
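
As a rough illustration of how the Praat annotations listed above might be read, the sketch below pulls labeled intervals out of a long-format TextGrid using only the Python standard library. It is not the authors' tooling. The tier name `syllable`, the sample labels and the function name `parse_textgrid` are assumptions made for the example; the actual tier names should be checked in the dataset files. Praat may save TextGrids as UTF-8 or UTF-16, so the encoding may need to be set when opening a file.

```python
import re
from typing import Dict, List, Tuple

Interval = Tuple[float, float, str]

# One interval entry of a long-format Praat TextGrid:
#   intervals [n]:  xmin = ...  xmax = ...  text = "..."
INTERVAL_RE = re.compile(
    r'intervals\s*\[\d+\]\s*:?\s*'
    r'xmin\s*=\s*([0-9.eE+-]+)\s*'
    r'xmax\s*=\s*([0-9.eE+-]+)\s*'
    r'text\s*=\s*"((?:[^"]|"")*)"'
)
NAME_RE = re.compile(r'name\s*=\s*"([^"]*)"')


def parse_textgrid(text: str) -> Dict[str, List[Interval]]:
    """Return {tier name: [(start, end, label), ...]}, skipping empty labels."""
    tiers: Dict[str, List[Interval]] = {}
    # Each tier starts with "item [n]:"; the header "item []:" has no number.
    for chunk in re.split(r'item\s*\[\d+\]\s*:', text)[1:]:
        name_match = NAME_RE.search(chunk)
        if name_match is None:
            continue
        intervals = []
        for start, end, label in INTERVAL_RE.findall(chunk):
            label = label.replace('""', '"').strip()  # Praat doubles quotes
            if label:
                intervals.append((float(start), float(end), label))
        tiers[name_match.group(1)] = intervals
    return tiers


# Hypothetical sample in Praat's long text format (not taken from the dataset).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "syllable"
        xmin = 0
        xmax = 2.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = ""
        intervals [2]:
            xmin = 0.5
            xmax = 1.4
            text = "wo"
        intervals [3]:
            xmin = 1.4
            xmax = 2.5
            text = "ben"
'''

tiers = parse_textgrid(SAMPLE)
for start, end, label in tiers["syllable"]:
    print(f"{label}: {end - start:.2f} s")
```

The regular expressions only handle interval tiers in the long text format; short-format or binary TextGrids would need a dedicated library.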


艺术家 Artists:


We invited 5 professional singers from NACTA (National Academy of Chinese Theatre Arts), all of whom have rich experience in stage performance and teaching, and another 4 amateur singers from jingju associations in non-art schools to the recording sessions.


伴奏 Accompaniment:


Seven singers (3 professionals and 4 amateurs) sang along with the accompaniment of commercial audio recordings; the other 2 professional singers were accompanied by 2 professional jinghu players (NACTA).

数据库的覆盖性,完整性,质量和重复利用性 Coverage, completeness, quality and reusability:

  1. 覆盖性: 数据库包含三个主要的京剧行当 - 老生、旦和净;两个主要声腔 - 西皮和二黄,和一些附属声腔,比如四平调、南梆子;包含所有的有节拍的板式 - 原板、慢板、快板、二六、流水、三眼和他们的变化板式。Coverage: The dataset includes the three main role-types - laosheng, dan and jing; the two main shengqiang - xipi and erhuang, and a few auxiliary ones, such as sipingdiao and nanbangzi; and the whole range of metered banshi - yuanban, manban, kuaiban, erliu, liushui, sanyan and its three variations.
  2. 完整性: 数据库包含有录音和唱句层级的元数据,由Excel spreadsheet格式保存。对于录音层级,元数据包括唱段名、行当、声腔、板式、是否由京胡伴奏。对于唱句层级,每一句都包含行当、声腔、板式、上下句、唱词和所匹配的MusicXML曲谱(有需要曲谱请联系作者)。Completeness: The dataset contains metadata and annotations at both the recording and the line level, organized in separate spreadsheets. For the recordings, the metadata contains the title of the work in Chinese, the role-type, shengqiang and banshi, and whether the recording contains jinghu accompaniment. Each line is annotated with the role-type, shengqiang, banshi, line type (opening or closing), the lyrics for the whole line and the related score in the score collection (available on request).
  3. 质量: 一小部分的录音带有中等程度的房间混响和轻微的背景噪声。其余的录音质量都很好。Quality: A small number of the recordings contain medium room reverberation and minor background noise. All the other recordings are dry, clean and of good quality.
  4. 重复利用性: 所有数据库音频和元数据都由Creative Commons Attribution-NonCommercial 4.0 International方式授权。Reusability: All the audio and metadata files in this dataset are licensed under Creative Commons Attribution-NonCommercial 4.0 International.


标注 Annotation:

数据库包含一部分录音的唱句起始位置和音节起始位置标注,标注格式为Praat TextGrid。唱句标注包含有每一唱句的歌词,此歌词从曲谱提取,并不与实际演唱一致;音节标注包含拼音,经过作者修正,试图与演唱发音一致。标注的统计如下:

  • 老生唱句数量,音节数量,音节平均时长 (秒),音节时长标准差 (秒): 405, 3941, 1.32, 2.15
  • 旦唱句数量, 音节数量, 音节平均时长 (秒), 音节时长标准差 (秒): 467, 4394, 1.63, 3.25
  • 总体唱句数量, 音节数量, 音节平均时长 (秒), 音节时长标准差 (秒): 872, 8335, 1.48, 2.79

The dataset contains line and syllable boundary annotations for a part of the recordings, in Praat TextGrid format. The line annotation contains the lyrics for each line, which are extracted from the score and might not be consistent with the actual singing; the syllable annotation contains pinyin, corrected by the author to be consistent with the actual singing. The statistics of the annotation are:

  • laosheng num. of lines, num. of syllables, average syllable duration (s), standard deviation (s): 405, 3941, 1.32, 2.15
  • dan num. of lines, num. of syllables, average syllable duration (s), standard deviation (s): 467, 4394, 1.63, 3.25
  • Overall num. of lines, num. of syllables, average syllable duration (s), standard deviation (s): 872, 8335, 1.48, 2.79
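
The overall row can be cross-checked against the laosheng and dan rows. The sketch below pools the two groups, assuming the reported standard deviations are population standard deviations (an assumption; the record does not say which estimator was used). Because the inputs are rounded to two decimals, the result is only expected to agree approximately with the overall figures.

```python
import math

# (num. of syllables, mean duration in s, standard deviation in s),
# copied from the laosheng and dan rows above.
groups = {
    "laosheng": (3941, 1.32, 2.15),
    "dan": (4394, 1.63, 3.25),
}


def pool(stats):
    """Combine per-group (n, mean, std) into overall (n, mean, std)."""
    total = sum(n for n, _, _ in stats)
    mean = sum(n * m for n, m, _ in stats) / total
    # Overall variance = within-group variance + between-group spread of the means.
    var = sum(n * (s ** 2 + (m - mean) ** 2) for n, m, s in stats) / total
    return total, mean, math.sqrt(var)


total, mean, std = pool(list(groups.values()))
print(total, round(mean, 2), round(std, 2))
```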


引用 Citation:


For more information, please refer to the following publication. If you use this dataset in your work, please cite it:

Rong Gong, Rafael Caro Repetto, Xavier Serra, “Creating an A Cappella Singing Audio Dataset for Automatic Jingju Singing Evaluation Research,” in 4th International Digital Libraries for Musicology workshop (DLfM 2017), Shanghai, China.


协议 License:

Creative Commons Attribution-NonCommercial 4.0


联系方式 Contact information:

如果有任何问题,请联系作者 If you have any questions, please contact the authors:

龚嵘 Rong Gong: Email - rong<dot>gong<at>upf<dot>edu, Wechat id - gongr86

贵云飞 Rafael Caro Repetto: Email - rafael<dot>caro<at>upf<dot>edu 


如果您想联系京剧演员 If you want to contact the jingju singers:

廖佳尼 Jiani Liao: Wechat id - v1307624197

邵雅昆 Yakun Shao: Wechat id - S_yakun-


或京胡乐手 Or jinghu players:

张蓝天 Lantian Zhang: Wechat id - tian576632395
