Published November 27, 2025 | Version v1
Dataset
Open
20 Newsgroups (5 Topics) — PII-Augmented version
Authors/Creators
Description
This dataset is a curated subset of the 20 Newsgroups corpus containing 5 clearly distinguishable topics, intended for experiments with intelligent text anonymization and topic classification.
It was created as part of the Bachelor’s thesis “Intelligent anonymization for natural language processing and inference” at FIIT STU, 2025.
Files
20 Newsgroups (5 Topics) — PII-Augmented version.pdf
| Name | Size |
|---|---|
| 20 Newsgroups (5 Topics) — PII-Augmented version.pdf (md5:9bb54fa61af52112be49952112aa75be) | 58.0 kB |