WikiTextGraph: A Python Tool for Parsing Multilingual Wikipedia Text and Graph Extraction
Authors/Creators
Description
WikiTextGraph is an open-source Python package designed to extract and process text from Wikipedia dumps and construct internal link networks across multiple language editions. It uses efficient parsing, redirect resolution, and multilingual graph-building techniques to tackle the challenges of Wikipedia’s scale, structure, and inherent noise. With a modular architecture and a simple graphical user interface (GUI), it is suitable for both technical and non-technical users. Built for scalability and reproducibility, WikiTextGraph supports interdisciplinary research in network science, computational linguistics, and digital humanities. Its flexible design enables easy adaptation for tasks involving low-resource or cross-lingual language studies.[1]
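To make the link-extraction and redirect-resolution steps concrete, here is a minimal sketch in plain Python. It does not use WikiTextGraph's own API: the function names (`extract_links`, `resolve_redirect`, `build_edges`), the regular expression, and the in-memory `pages` and `redirects` dictionaries are all assumptions made for illustration, and a real dump would be streamed rather than held in memory.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the target part is captured. (Illustrative pattern, not the package's.)
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(wikitext):
    """Return internal link targets in order of appearance."""
    targets = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip().replace("_", " ")
        # Crude filter: drops File:, Category: and interwiki links,
        # but also any article title that contains a colon.
        if not target or ":" in target:
            continue
        # Wikipedia titles are case-insensitive in their first character.
        targets.append(target[0].upper() + target[1:])
    return targets


def resolve_redirect(title, redirects):
    """Follow a redirect chain to its end, stopping if a cycle is detected."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title


def build_edges(pages, redirects):
    """Build a sorted, de-duplicated list of (source, target) edges."""
    edges = set()
    for source, text in pages.items():
        src = resolve_redirect(source, redirects)
        for target in extract_links(text):
            dst = resolve_redirect(target, redirects)
            if dst != src:  # skip self-loops
                edges.add((src, dst))
    return sorted(edges)


# Hypothetical toy data standing in for a parsed dump.
pages = {
    "Athens": "Capital of [[Greece]]. See [[Hellas|the country]], "
              "[[File:x.jpg]] and [[acropolis of Athens#History]]."
}
redirects = {"Hellas": "Greece"}
print(build_edges(pages, redirects))
# [('Athens', 'Acropolis of Athens'), ('Athens', 'Greece')]
```

Resolving redirects before adding an edge keeps alternative titles from appearing as separate nodes, which is why `Hellas` and `Greece` collapse into a single target here.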
Files (58.2 kB)
README.md
| MD5 checksum | Size |
|---|---|
| e22b744f4a9b09da09d4be511c3b8714 | 2.3 kB |
| afd478c6482908ef5de2a8649b3acab1 | 7.3 kB |
| 21b9eea6c4f8aa9fae8d766c643bd61a | 8.5 kB |
| 933d35cc0c8f79bbfa1330ffc19033cb | 6.5 kB |
| 86d3f3a95c324c9479bd8986968f4327 | 11.4 kB |
| 9546c0faab48f5e80cdc78181744ba2d | 7.3 kB |
| 5bfc01dc9efde70c33b63a3cbed868d4 | 7.0 kB |
| 009d00c94160a431388a6cf0e4a05757 | 649 bytes |
| ca80d0b9a9f6b6db91e087293b8bba9a | 4.6 kB |
| c5c1f3345d8bbb5cc491e88848b2630d | 2.7 kB |
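The MD5 checksums listed for each file can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the caller is assumed to supply the local path of a downloaded file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with a listed value such as 'md5:e22b...'."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]  # drop the 'md5:' label used in the listing
    return md5_of_file(path) == expected
```

MD5 is adequate for detecting accidental corruption during transfer, but it should not be relied on as protection against deliberate tampering.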
Additional details
Software
- Repository URL: https://github.com/PaschalisAg/WikiTextGraph
- Programming language: Python
- Development status: Active