Published July 13, 2017 | Version 1.0
Dataset Open

Jingju Lyrics Datasets

  • 1. Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain


To study the expressive functions of jingju metrical patterns in relation to the lyrics, a series of datasets has been created from the Jingju Lyrics Collection, which was collected by scraping the online repository of jingju libretti Zhongguo jingju xikao 中国京剧戏考. These datasets were created to analyse the lyrics of the banshi yuanban, manban, kuaiban and yaoban in both shengqiang, xipi and erhuang (kuaiban is not used in erhuang), by applying NLP techniques, namely topic modelling and document classification.
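The category space described above can be enumerated in a short sketch. This is only an illustration of the shengqiang and banshi combinations named in the description; the identifiers `SHENGQIANG`, `BANSHI` and `label_pairs` are hypothetical and do not reflect the file names or layout of the datasets themselves.

```python
# Illustrative only: these names are assumptions, not the datasets' own labels.
SHENGQIANG = ["xipi", "erhuang"]
BANSHI = ["yuanban", "manban", "kuaiban", "yaoban"]


def label_pairs():
    """Return every (shengqiang, banshi) pair covered by the description.

    kuaiban is not used in erhuang, so that pair is excluded.
    """
    return [
        (s, b)
        for s in SHENGQIANG
        for b in BANSHI
        if not (s == "erhuang" and b == "kuaiban")
    ]


if __name__ == "__main__":
    for shengqiang, banshi in label_pairs():
        print(f"{shengqiang}-{banshi}")
```

Under these assumptions there are seven categories: four banshi in xipi and three in erhuang.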

Using this dataset

We are interested in knowing if you find our datasets useful! If you use our dataset please email us at and tell us about your research.


Files (15.4 MB)
