Published September 16, 2021 | Version 1.0.0
Dataset
Open
Cross-Register Authorship Attribution Corpus
Creators
- 1. Indiana University Bloomington
- 2. Shanghai Normal University
Description
This corpus contains writings by eight authors known to have written in both vernacular and classical Chinese. It comprises 4.2 million Chinese characters and may be useful for authorship identification research.
The file README.md contains a full description of the data.
All materials in this archive are in the public domain.
Files
| Name | Size |
|---|---|
| cross-register-authorship-attribution-corpus-v1.0.0.zip (md5:9748d57f64b85905da602e1b819df0ac) | 8.0 MB |
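The MD5 checksum listed for the archive can be used to confirm that a download is complete and uncorrupted. The following is a minimal sketch in Python, assuming the zip file has already been saved locally; the function names `md5_of_file` and `archive_is_intact` are illustrative helpers and not part of the corpus or its README.

```python
import hashlib

# Checksum and file name as listed in the Files section of this record.
EXPECTED_MD5 = "9748d57f64b85905da602e1b819df0ac"
ARCHIVE_NAME = "cross-register-authorship-attribution-corpus-v1.0.0.zip"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    # Hypothetical helper: compares case-insensitively against the listed checksum.
    return md5_of_file(path) == expected.lower()
```

With the archive in the current directory, `archive_is_intact(ARCHIVE_NAME)` should return `True` for an undamaged copy. MD5 is adequate here for detecting accidental corruption, though it is not a safeguard against deliberate tampering.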