Published February 5, 2025 | Version 1.0
Dataset Open

TV BRICS Archive of Titles and Links in English, Russian, Portuguese, Chinese, Spanish and Arabic

  • 1. German Marshall Fund of the United States
  • 2. Alliance for Securing Democracy

Description

This dataset, produced by a scraping script, contains news article metadata from the TV BRICS website across multiple language editions. It is distributed as a single CSV file (tvbrics_news_archive.csv).

Columns

  • Language: The language in which the news article is published. The dataset includes news articles in Russian, English, Portuguese, Chinese, Spanish, and Arabic.
  • Title: The headline or title of the news article.
  • Path: The relative URL path of the article on TV BRICS (e.g., /news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
  • Link: The full URL to the news article, constructed using the website domain (e.g., https://tvbrics.com/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/).
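The relationship between Path and Link can be checked with a short script. The following is a minimal sketch, not the dataset authors' code: it assumes the CSV header uses exactly the column names listed above, and the sample row (its Language value and Title) is a made-up placeholder built around the example path.

```python
import csv
import io
from urllib.parse import urljoin

# Domain taken from the example Link above.
BASE_URL = "https://tvbrics.com"


def build_link(path: str, base_url: str = BASE_URL) -> str:
    """Join the site domain and a relative Path into a full Link."""
    return urljoin(base_url, path)


def find_mismatches(csv_text: str) -> list[dict]:
    """Return the rows whose Link differs from domain + Path."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if build_link(row["Path"]) != row["Link"]]


# Hypothetical sample row; header names are assumed to match the list above.
EXAMPLE_PATH = "/news/iran-zapustil-pervuyu-v-islamskom-mire-platformu-dlya-nauchnogo-obmena/"
sample = (
    "Language,Title,Path,Link\n"
    f"Russian,Placeholder title,{EXAMPLE_PATH},https://tvbrics.com{EXAMPLE_PATH}\n"
)

mismatches = find_mismatches(sample)
print(f"{len(mismatches)} row(s) with inconsistent links")
```

To run the same check on the real file, read tvbrics_news_archive.csv with `open(..., encoding="utf-8", newline="")` and pass its contents to `find_mismatches`.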

Characteristics of the Dataset

  • Multilingual Scope: The dataset includes articles from different linguistic sections of the website, making it suitable for comparative media analysis across languages.
  • Structured and Uniform Format: Every entry follows the same structure: language, title, relative path, and absolute URL.
  • Pagination-Based Extraction: Articles are fetched from multiple listing pages per language, giving broad coverage of news over time.
  • Chronologically Ordered: The scraping script sorts the results by publication date in descending order, capturing the most recent articles first.
  • Deduplication Considerations: The script prevents redundant entries by checking if the first scraped article on a page already exists in the dataset.
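The pagination and deduplication behaviour described above can be illustrated as follows. This is a sketch under stated assumptions rather than the actual scraper: `fetch_page` is a hypothetical callable standing in for the HTTP request and HTML parsing, and stopping (rather than skipping) when the first article on a page is already known is an assumption about how the check is used.

```python
from typing import Callable

Article = dict[str, str]


def collect_articles(
    fetch_page: Callable[[int], list[Article]],  # hypothetical page fetcher
    max_pages: int,
) -> list[Article]:
    """Walk listing pages in order, stopping once a page repeats known content."""
    seen_paths: set[str] = set()
    collected: list[Article] = []
    for page_number in range(1, max_pages + 1):
        articles = fetch_page(page_number)
        if not articles:
            break  # empty page: nothing more to fetch
        # Dedup check on the first article only, as described above.
        if articles[0]["Path"] in seen_paths:
            break
        for article in articles:
            seen_paths.add(article["Path"])
            collected.append(article)
    return collected


# Fake pager for demonstration: requests past the end repeat the last page.
fake_pages = {
    1: [{"Path": "/news/a/"}, {"Path": "/news/b/"}],
    2: [{"Path": "/news/c/"}, {"Path": "/news/d/"}],
}


def fake_fetch(page_number: int) -> list[Article]:
    return fake_pages.get(page_number, fake_pages[2])


result = collect_articles(fake_fetch, max_pages=5)
print([article["Path"] for article in result])
```

Checking only the first article keeps the test cheap, but it would not catch a page that overlaps the previous one partially, so a per-row check on Path may still be worthwhile when analysing the CSV.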

Potential Uses

  • Comparative News Analysis: Investigating how different linguistic versions of TV BRICS report on the same events.
  • Disinformation and Influence Studies: Analyzing narratives and framing across language-specific editions.
  • International Media Monitoring: Tracking coverage trends across geopolitical regions.
  • Linguistic and Sentiment Analysis: Evaluating tone, sentiment, and framing variations by language.
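As a starting point for any of these analyses, it helps to know how many articles each language edition contributes. The sketch below counts rows per language; it assumes the header names match the column list, and the sample rows are invented placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter


def count_by_language(csv_text: str) -> Counter:
    """Count rows per value of the Language column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["Language"] for row in reader)


# Hypothetical sample rows for illustration only.
sample = (
    "Language,Title,Path,Link\n"
    "English,Placeholder one,/news/one/,https://tvbrics.com/news/one/\n"
    "English,Placeholder two,/news/two/,https://tvbrics.com/news/two/\n"
    "Spanish,Placeholder three,/news/three/,https://tvbrics.com/news/three/\n"
)

counts = count_by_language(sample)
for language, n_articles in counts.most_common():
    print(language, n_articles)
```

Large differences in per-language counts are worth noting before comparing coverage across editions, since they affect how representative each subset is.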

This dataset serves as a structured repository of TV BRICS articles, facilitating further research in media studies, information operations, and digital propaganda analysis.

Files

tvbrics_news_archive.csv (10.3 MB)
md5:6af358066c1146cb7731542bc3b45117

Additional details

Dates

Collected
2025-02-05

Software

Repository URL
https://github.com/pbenzoni/tvbrics_scraper
Programming language
Python
Development Status
Inactive