Published January 12, 2022 | Version v2
Dataset Open

Transindex

Description

This object was created as part of the web harvesting project of the Department of Digital Humanities at Eötvös Loránd University (ELTE DH). Learn more about the workflow HERE and about the software used HERE. The aim of the project is to make online news articles and their metadata suitable for research purposes. The archiving workflow is designed to prevent modification or manipulation of the downloaded content. The current version of the curated content, with normalized formatting in standard TEI XML and Schema.org-encoded metadata, is available HERE. The raw content is described in detail below:

  • The portal's archived content (from 2001-01-01 to 2021-05-22) in WARC format is available HERE (crawled: 2021-05-21T10:01:38.592950 - 2021-05-22T20:50:22.079445).

Please fill in the following form before requesting access to this dataset: ACCESS FORM

Files

README.md

Files (146 Bytes)

md5:ef949fa712a4e184fd9d88f0e5007598
146 Bytes
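
The listed MD5 value can be used to check that a downloaded copy of README.md arrived intact. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `matches_listed_checksum` are hypothetical helpers introduced for illustration, and the local file path is an assumption about where the file was saved.

```python
import hashlib

# Checksum as listed on this record page for README.md.
EXPECTED_MD5 = "ef949fa712a4e184fd9d88f0e5007598"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Assuming the file was saved as `README.md` in the working directory, `matches_listed_checksum("README.md")` should return `True` for an unmodified download. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.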

Additional details

Related works

Has part
Dataset: 10.5281/zenodo.4899469 (DOI)
Dataset: 10.5281/zenodo.5828866 (DOI)

Dates

Collected
2001-01-01/2021-05-22
content publication date interval provided by source