Published February 5, 2025 | Version 1.0
Dataset
Open
TV BRICS Archive of Titles and Links in English, Russian, Portuguese, Chinese, Spanish and Arabic
Authors/Creators
Description
This dataset contains news article metadata scraped from the TV BRICS website across six languages. It is distributed as a single CSV file produced by a Python scraping script.
Columns
- Language: The language in which the news article is published. The dataset includes news articles in Russian, English, Portuguese, Chinese, Spanish, and Arabic.
- Title: The headline or title of the news article.
- Path: The relative URL path of the article on TV BRICS (e.g., /news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
- Link: The full URL to the news article, constructed using the website domain (e.g., https://tvbrics.com/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
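The relationship between the Path and Link columns can be checked with a short script. The following is a minimal sketch, assuming the CSV header row uses exactly the column names listed above; the sample row (its headline and its Language value) is a made-up placeholder, not a record from the dataset.

```python
import csv
import io
from urllib.parse import urljoin

BASE_URL = "https://tvbrics.com"  # domain taken from the example link

# Tiny in-memory stand-in for tvbrics_news_archive.csv.
# Assumption: the header matches the documented column names.
# The title and language value below are placeholders.
SAMPLE_CSV = (
    "Language,Title,Path,Link\n"
    "Russian,Placeholder headline,"
    "/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/,"
    "https://tvbrics.com/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/\n"
)


def read_rows(handle):
    """Parse an open CSV handle into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def link_matches_path(row, base_url=BASE_URL):
    """Return True if Link equals the domain joined with the relative Path."""
    return urljoin(base_url, row["Path"]) == row["Link"]


rows = read_rows(io.StringIO(SAMPLE_CSV))
mismatches = [row for row in rows if not link_matches_path(row)]
print(f"{len(rows)} rows checked, {len(mismatches)} mismatches")
```

To run the same check on the downloaded file, pass `open("tvbrics_news_archive.csv", newline="", encoding="utf-8")` to `read_rows`; the UTF-8 encoding is an assumption.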
Characteristics of the Dataset
- Multilingual Scope: The dataset includes articles from different linguistic sections of the website, making it suitable for comparative media analysis across languages.
- Structured and Uniform Format: Each entry follows the same format: language, title, relative path, and absolute URL.
- Pagination-Based Extraction: Articles are fetched from multiple pages per language, ensuring a broad coverage of news over time.
- Chronologically Ordered: The scraping script sorts the results by publication date in descending order, capturing the most recent articles first.
- Deduplication Considerations: The script prevents redundant entries by checking if the first scraped article on a page already exists in the dataset.
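The pagination and deduplication behaviour described above can be sketched as follows. This is not the scraper's actual code: `fetch_page` is a hypothetical callable standing in for the HTTP request and HTML parsing, and the sketch assumes that the script stops paginating once the first article on a page is already known (the description does not say whether it stops or merely skips that page).

```python
def collect_new_articles(fetch_page, known_paths, max_pages=100):
    """Walk listing pages newest-first and return articles not yet stored.

    fetch_page(page_number) is a hypothetical callable returning a list of
    dicts with "Title" and "Path" keys, newest first. An empty list means
    there are no more pages.
    """
    collected = []
    seen = set(known_paths)
    for page_number in range(1, max_pages + 1):
        articles = fetch_page(page_number)
        if not articles:
            break
        # Assumed stop condition: if the first article on this page is
        # already in the dataset, the rest of the listing is older still.
        if articles[0]["Path"] in seen:
            break
        for article in articles:
            if article["Path"] not in seen:
                seen.add(article["Path"])
                collected.append(article)
    return collected


# Fake two-page listing used only to exercise the function.
FAKE_PAGES = {
    1: [{"Title": "C", "Path": "/news/c/"}, {"Title": "B", "Path": "/news/b/"}],
    2: [{"Title": "A", "Path": "/news/a/"}],
}

new_items = collect_new_articles(
    lambda n: FAKE_PAGES.get(n, []), known_paths={"/news/a/"}
)
print([item["Path"] for item in new_items])
```

Checking only the first article keeps the stop test cheap, but it relies on the listing being sorted newest-first, as stated in the characteristics above.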
Potential Uses
- Comparative News Analysis: Investigating how different linguistic versions of TV BRICS report on the same events.
- Disinformation and Influence Studies: Analyzing narratives and framing across language-specific editions.
- International Media Monitoring: Tracking coverage trends across geopolitical regions.
- Linguistic and Sentiment Analysis: Evaluating tone, sentiment, and framing variations by language.
This dataset serves as a structured repository of TV BRICS articles, facilitating further research in media studies, information operations, and digital propaganda analysis.
Files
| Name | Size | Checksum |
|---|---|---|
| tvbrics_news_archive.csv | 10.3 MB | md5:6af358066c1146cb7731542bc3b45117 |
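The MD5 checksum in the file listing can be used to confirm that a download is intact. A minimal sketch, assuming the file was saved locally under its listed name; the helper names are illustrative:

```python
import hashlib

# Checksum as published in the file listing for tvbrics_news_archive.csv.
EXPECTED_MD5 = "6af358066c1146cb7731542bc3b45117"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_download("tvbrics_news_archive.csv")` should return `True` for an uncorrupted copy of version 1.0.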
Additional details
Dates
- Collected
- 2025-02-05
Software
- Repository URL
- https://github.com/pbenzoni/tvbrics_scraper
- Programming language
- Python
- Development Status
- Inactive