Published February 19, 2021 | Version v1
Dataset
Open
Pokémon Story Corpus
Description
The larger corpus consists of fan-written stories about Pokémon. It is sentence- and word-tokenized, and the order of the sentences is shuffled for copyright reasons. The smaller corpus contains descriptions of the first 151 Pokémon.
Sources: https://www.fanfiction.net/ and https://www.giantbomb.com/
Please cite the following paper if you use the resources:
Hämäläinen, M., Alnajjar, K. & Partanen, N. (2021). Nettikorpuksen avulla tuotettuja sanavektorimalleja Pokémonien ominaisuuksien kuvaamiseksi. In Saarikivi, T. & Saarikivi, J. (eds.) Turhan tiedon kirja — Tutkimuksista pois jätettyjä sivuja. p. 199-214. SKS Kirjat
Translation of the paper in English
Files
pokemon-descriptioncorpus.json
Total size: 2.3 GB
Checksum | Size |
---|---|
md5:a88d22c82736f01053c1b4b029d0b867 | 253.4 kB |
md5:1ce50304d97164aced67be2e784ae9d4 | 2.3 GB |
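The MD5 checksums listed for the two files can be used to confirm that a download completed without corruption. The following is a minimal Python sketch of such a check, not part of the dataset itself. The function names `md5_of_file` and `matches_listed_md5`, the dictionary `LISTED_MD5`, and its size-based keys are hypothetical names chosen for illustration; the local path you pass in is wherever you saved the file.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
# The keys are illustrative labels (assumption), not file names.
LISTED_MD5 = {
    "253.4 kB file": "md5:a88d22c82736f01053c1b4b029d0b867",
    "2.3 GB file": "md5:1ce50304d97164aced67be2e784ae9d4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat, which matters for the
    2.3 GB corpus file.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a local file against a checksum such as 'md5:abc...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("path/to/downloaded/file", LISTED_MD5["2.3 GB file"])` returns `True` only if the local copy matches the listed checksum. MD5 is used here solely because it is the checksum the record publishes; it detects accidental corruption, not deliberate tampering.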