Published December 11, 2025
| Version v3
Dataset
Open
Wikipedia Articles with math tags
Authors/Creators
Description
The dataset contains the latest Wikipedia dumps in all languages, filtered to the pages whose wikitext source contains math tags. Note that pages where math is typeset via templates, or otherwise without math tags, are not included. It also includes dumps from other wiki projects, such as Wikiversity, filtered by the same criteria. In total, this results in 656 files from 1031 wikis.
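The inclusion criterion described above can be illustrated with a minimal sketch. It assumes that a page qualifies when its wikitext contains an opening `<math>` tag, with or without attributes. The pattern `MATH_TAG`, the function `contains_math_tag`, and the sample pages are hypothetical and are not taken from the script used to build the dataset, whose exact matching rule may differ.

```python
import re

# Assumption: a page qualifies if its wikitext has an opening <math> tag,
# optionally with attributes. The real filter script may match differently.
MATH_TAG = re.compile(r"<math[\s>]", re.IGNORECASE)


def contains_math_tag(wikitext: str) -> bool:
    """Return True if the wikitext contains an opening <math> tag."""
    return MATH_TAG.search(wikitext) is not None


# Hypothetical sample pages (title -> wikitext), for illustration only.
pages = {
    "Pythagorean theorem": "The identity <math>a^2 + b^2 = c^2</math> holds.",
    "Template-only page": "The identity {{math|a^2 + b^2 {{=}} c^2}} holds.",
    "Plain page": "No formulas here.",
}

# Only pages with a literal math tag are kept; template-based math is skipped.
kept = [title for title, text in pages.items() if contains_math_tag(text)]
```

Under this assumption, the template-only page is dropped even though it renders a formula, which matches the limitation noted in the description.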
The dumps were created using the wikiFiler script.
The script can be accessed via
To improve reproducibility, the input hashes for all 1031 wikis are included starting with this version.
No dumps exist for the following (not public?) wikis:
- arbcom_plwiki
- officewiki
- tokwiki
Files (1.6 GB)

| Name | Size | Checksum |
|---|---|---|
| inputHashes.csv | 331.3 kB | md5:c254568ebdd1e0e5de035d30bca54088 |
| | 1.6 GB | md5:292c0c623edc081a2cbfe1778ddcc633 |
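The md5 values listed for the files can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the local file path is whatever name the download was saved under.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's md5 with a listed value such as 'md5:<hex digest>'."""
    # Drop an optional 'md5:' prefix and normalize case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, assuming the smaller file was saved locally as `inputHashes.csv`, calling `matches_listed_checksum("inputHashes.csv", "md5:c254568ebdd1e0e5de035d30bca54088")` should return `True` for an intact download.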