Published August 20, 2025 | Version v2
Dataset Open

Enriching KoLLA with Multi-reference Annotations and Rubric-Based Scoring

Description

We enhance the KoLLA Korean learner corpus by adding multiple grammatical error correction (GEC) references, which enable more nuanced and flexible evaluation of GEC systems and better capture the variability of human language. In addition, we enrich the corpus with rubric-based scores aligned with guidelines from the Korean National Language Institute, providing measures of grammatical accuracy, coherence, and lexical diversity. Together, these enhancements establish KoLLA as a robust and standardized resource for research in Korean L2 education, supporting advances in language learning, assessment, and automated error correction.
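To illustrate why multiple references allow more flexible evaluation, here is a minimal sketch of one common convention: score a system output against every available reference and keep the best match. This is not the scoring procedure used for KoLLA. The function names are hypothetical, and the token-overlap ratio from Python's `difflib` stands in for a real GEC metric.

```python
from difflib import SequenceMatcher


def similarity(hypothesis, reference):
    """Token-level similarity in [0, 1]; a stand-in, NOT a real GEC metric."""
    return SequenceMatcher(None, hypothesis.split(), reference.split()).ratio()


def best_reference_score(hypothesis, references):
    """Score a correction against each reference and keep the highest score.

    With a single reference, a valid but differently worded correction is
    penalized; with several, it only needs to match one of them.
    """
    if not references:
        raise ValueError("at least one reference is required")
    return max(similarity(hypothesis, ref) for ref in references)


# Invented sentences for illustration only (not taken from the corpus).
hypothesis = "저는 학교에 갑니다"
references = ["나는 학교에 간다", "저는 학교에 갑니다"]
score = best_reference_score(hypothesis, references)
```

Because the hypothesis matches the second reference exactly, it is not penalized for differing from the first one.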

The original KoLLA (v1.0), including the learners’ written texts, is available at https://cl.indiana.edu/~kolla/

Files

FB.csv

Files (358.1 kB)

Checksum                              Size
md5:0ffe834b41d72cb86bcacabe7f051e3d  2.1 kB
md5:64b85fd0992476725716876be6b73d6c  2.1 kB
md5:9e2555261498394c0318f6d40f72079e  2.1 kB
md5:eede0dc8fee2fa45c76e23cbed94e78e  2.1 kB
md5:9a6f2e3fea1b39bbb7343445db1167f7  349.8 kB
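
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The function names are hypothetical, and because the listing does not show which checksum belongs to which file, the path in the usage note is a placeholder.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file with a checksum written as 'md5:<hex>' or as bare hex."""
    expected = listed.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("path/to/downloaded_file", "md5:9a6f2e3fea1b39bbb7343445db1167f7")` returns `True` only if the local file's digest equals the listed value; the path here is an assumption to be replaced with the actual download location.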

Additional details

Additional titles

Alternative title (Korean)
KoLLA v2

Dates

Updated
2025-08-20