Published February 5, 2025 | Version 1.0
Dataset
Open
TV BRICS Archive of Titles and Links in English, Russian, Portuguese, Chinese, Spanish and Arabic
Creators
Description
The dataset contains metadata for news articles from the TV BRICS website, collected by a scraping script across the site's language editions. It is distributed as a single CSV file.
Columns
- Language: The language in which the news article is published. The dataset includes news articles in Russian, English, Portuguese, Chinese, Spanish, and Arabic.
- Title: The headline or title of the news article.
- Path: The relative URL path of the article on TV BRICS (e.g., /news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
- Link: The full URL to the news article, constructed using the website domain (e.g., https://tvbrics.com/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
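The relationship between Path and Link can be checked with a short sketch like the one below. It assumes the CSV header uses the column names exactly as listed (Language, Title, Path, Link) and that Link is simply the domain joined with Path; the sample row, its title, and its language value are placeholders, not actual rows from the file.

```python
import csv
import io
from urllib.parse import urljoin

# Domain taken from the example Link in the column description.
BASE_URL = "https://tvbrics.com"


def build_link(path, base_url=BASE_URL):
    """Join the site domain with a relative article path."""
    return urljoin(base_url, path)


def find_inconsistent_rows(csv_file):
    """Return rows whose Link is not the domain joined with Path.

    Assumes header names Language, Title, Path, Link (an assumption
    based on the column list, not verified against the file).
    """
    reader = csv.DictReader(csv_file)
    return [row for row in reader if build_link(row["Path"]) != row["Link"]]


# Placeholder row built from the example path; title and language are made up.
sample_csv = (
    "Language,Title,Path,Link\n"
    "Russian,Placeholder title,"
    "/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/,"
    "https://tvbrics.com/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/\n"
)

mismatches = find_inconsistent_rows(io.StringIO(sample_csv))
print(len(mismatches))
```

To run the same check on the downloaded file, pass `open("tvbrics_news_archive.csv", newline="", encoding="utf-8")` instead of the in-memory sample; the UTF-8 encoding is also an assumption.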
Characteristics of the Dataset
- Multilingual Scope: The dataset includes articles from different linguistic sections of the website, making it suitable for comparative media analysis across languages.
- Structured and Uniform Format: Every entry follows the same format: language, title, relative path, and absolute URL.
- Pagination-Based Extraction: Articles are fetched from multiple pages per language, giving broad coverage of news over time.
- Chronologically Ordered: The scraping script sorts the results by publication date in descending order, capturing the most recent articles first.
- Deduplication Considerations: The script prevents redundant entries by checking if the first scraped article on a page already exists in the dataset.
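The pagination and deduplication behaviour described above can be sketched as follows. This is not the scraper's actual code: `collect_articles` and `fetch_page` are hypothetical names, and stopping at the first repeated page is an assumption about how the first-article check is used. The fake page source stands in for network requests so the sketch runs offline.

```python
from typing import Callable, Dict, List

Article = Dict[str, str]


def collect_articles(fetch_page: Callable[[int], List[Article]],
                     max_pages: int) -> List[Article]:
    """Walk numbered pages until one is empty or starts with a known article."""
    seen_paths = set()
    collected: List[Article] = []
    for page_number in range(1, max_pages + 1):
        articles = fetch_page(page_number)
        # Only the first article on the page is checked, as in the description.
        if not articles or articles[0]["Path"] in seen_paths:
            break
        for article in articles:
            seen_paths.add(article["Path"])
            collected.append(article)
    return collected


# Fake two-page site: any page past the end repeats the last page
# (an assumed behaviour, used here to trigger the duplicate check).
fake_pages = {
    1: [{"Path": "/news/a/"}, {"Path": "/news/b/"}],
    2: [{"Path": "/news/c/"}, {"Path": "/news/d/"}],
}


def fake_fetch(page_number: int) -> List[Article]:
    return fake_pages.get(page_number, fake_pages[2])


result = collect_articles(fake_fetch, max_pages=5)
print([article["Path"] for article in result])
```

Checking only the first article keeps the test cheap, but it would not catch duplicates that appear lower on a page whose first article is new.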
Potential Uses
- Comparative News Analysis: Investigating how different linguistic versions of TV BRICS report on the same events.
- Disinformation and Influence Studies: Analyzing narratives and framing across language-specific editions.
- International Media Monitoring: Tracking coverage trends across geopolitical regions.
- Linguistic and Sentiment Analysis: Evaluating tone, sentiment, and framing variations by language.
This dataset serves as a structured repository of TV BRICS articles, facilitating further research in media studies, information operations, and digital propaganda analysis.
Files
Name | Size | MD5 |
---|---|---|
tvbrics_news_archive.csv | 10.3 MB | 6af358066c1146cb7731542bc3b45117 |
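The MD5 checksum listed for tvbrics_news_archive.csv can be used to confirm that a download is intact. The sketch below uses Python's standard `hashlib`; the helper name `md5_of_file` and the local file location are assumptions.

```python
import hashlib
import os

# Checksum as listed on this record for tvbrics_news_archive.csv.
EXPECTED_MD5 = "6af358066c1146cb7731542bc3b45117"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumes the CSV was saved to the current directory under its listed name.
local_copy = "tvbrics_news_archive.csv"
if os.path.exists(local_copy):
    print("checksum matches:", md5_of_file(local_copy) == EXPECTED_MD5)
```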
Additional details
Dates
- Collected: 2025-02-05
Software
- Repository URL: https://github.com/pbenzoni/tvbrics_scraper
- Programming language: Python
- Development Status: Inactive