Published December 22, 2024 | Version 0.1
Model | Open

HJ-Ky-0.1: fastText Models

Description

This record releases the fastText embeddings that the paper describes as follows:

[...] we trained fastText embeddings on the Leipzig Corpus data. The training scheme was also Skip-Gram Negative Sampling, with 10 epochs, vector dimensions of 100 and 300, a window size of 5, and 10 negative samples. Character n-grams of 3 to 6 characters and 2,000,000 hashing buckets were used for the hashing trick.
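To illustrate the "hashing trick" mentioned in the quoted passage, here is a minimal standard-library sketch of how a word could be split into character n-grams of 3 to 6 characters and mapped to one of 2,000,000 buckets. It follows what is commonly understood of the reference fastText implementation (word boundary markers `<` and `>`, a 32-bit FNV-1a hash with bytes treated as signed), but that correspondence is an assumption and has not been checked against the released models. The function names `fnv1a_32`, `char_ngrams`, and `ngram_bucket` are hypothetical helpers, not part of any library.

```python
# Sketch only: assumes fastText-style subword hashing (FNV-1a, signed bytes).
MINN, MAXN = 3, 6          # n-gram lengths stated in the description
BUCKETS = 2_000_000        # number of hashing buckets stated in the description

FNV_OFFSET = 2166136261
FNV_PRIME = 16777619


def fnv1a_32(text: str) -> int:
    """32-bit FNV-1a over the UTF-8 bytes of `text`."""
    h = FNV_OFFSET
    for byte in text.encode("utf-8"):
        # Assumption: bytes are sign-extended (int8 cast) before the XOR,
        # which only changes the result for non-ASCII bytes.
        if byte >= 128:
            byte |= 0xFFFFFF00
        h = ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFF
    return h


def char_ngrams(word: str, minn: int = MINN, maxn: int = MAXN) -> list:
    """All character n-grams of the boundary-wrapped word, minn..maxn long."""
    wrapped = f"<{word}>"
    grams = []
    for start in range(len(wrapped)):
        for n in range(minn, maxn + 1):
            if start + n <= len(wrapped):
                grams.append(wrapped[start:start + n])
    return grams


def ngram_bucket(ngram: str, buckets: int = BUCKETS) -> int:
    """Bucket index of an n-gram; colliding n-grams share one vector."""
    return fnv1a_32(ngram) % buckets


if __name__ == "__main__":
    for gram in char_ngrams("salam"):
        print(gram, ngram_bucket(gram))
```

Because the bucket count is fixed, distinct n-grams can collide and share a vector. In the reference implementation the bucket rows are, to my understanding, stored after the rows for in-vocabulary words, so the index above would be an offset into that second part of the embedding matrix.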

Files (3.7 GB)

Checksum | Size
md5:c1fee3674299bcb1bc1aa6213f78318b | 930.5 MB
md5:6f8e53b4e742321215ae1f8ff4f89869 | 2.8 GB
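The md5 values listed for the two files can be used to check a download for corruption. The sketch below uses only the Python standard library; the file name in the usage note is a placeholder, since the actual file names are not shown in this listing, and `md5_of_file` and `matches_checksum` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hex md5 digest of a file, read in chunks so multi-GB files fit in memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum("downloaded_model.bin", "md5:c1fee3674299bcb1bc1aa6213f78318b")` would return `True` only if the local file, here given a placeholder name, is byte-identical to the 930.5 MB file in this record.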

Additional details

Related works

Is supplement to
Journal article: 10.56634/16948335.2023.4.1723-1731 (DOI)
Preprint: arXiv:2411.10724 (arXiv)