There is a newer version of this record available.

Dataset Open Access

The University of Pittsburgh English Language Institute Corpus (PELIC)

Alan Juffs; Na-Rae Han; Ben Naismith

This is the first public release of the dataset from the University of Pittsburgh English Language Institute Corpus (PELIC). PELIC is a publicly available 4.2-million-word learner corpus of written texts. These texts were collected in an English for Academic Purposes (EAP) context over seven years in the University of Pittsburgh’s Intensive English Program and were produced by over 1100 students with a wide range of linguistic backgrounds and proficiency levels. PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting. In addition to the data, the PELIC repository contains corpus statistics and tutorials on how to access and analyze the data.

Corpus homepage: https://eli-data-mining-group.github.io/Pitt-ELI-Corpus/
Files (230.7 kB)
Name: ELI-Data-Mining-Group/PELIC-dataset-v1.0.zip
Size: 230.7 kB
md5: 6fe0a2f5c551b13cab17f827d6590cb8
Statistics         All versions   This version
Views              159            134
Downloads          14             11
Data volume        6.4 MB         2.5 MB
Unique views       132            115
Unique downloads   13             10

Cite as