Dataset of Spanish Parliamentary Interventions by Legislature (2000–2023)
Description
This repository contains datasets of parliamentary records from the Spanish Congress covering legislatures 7 to 14 (2000–2023). The material was collected, cleaned, and pre-processed as part of a study on ideological and affective polarisation in the Spanish parliament, presented in the paper "Analyzing polarization among Spanish political elites using Machine Learning techniques".
Each file corresponds to one legislature and is provided in Parquet format (compressed with gzip). The files include:
- Full text of parliamentary interventions.
- Metadata for each speech (date, speaker, party, session).
- Pre-processed text fields for Natural Language Processing (NLP) applications.
These datasets allow replication of the analyses presented in the article and provide a resource for further research on Spanish political discourse, sentiment analysis, and ideology mapping.
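As a starting point for working with the files, the following is a minimal sketch of loading one legislature at a time with pandas. The file names and column names are not listed on this page, so the `*.parquet*` glob pattern and the helper names `list_parquet_files` and `load_legislature` are assumptions made for illustration; the sketch inspects the schema instead of assuming one.

```python
from pathlib import Path


def list_parquet_files(directory, pattern="*.parquet*"):
    """Return the files in `directory` matching `pattern`, sorted by name.

    The default pattern is an assumption: adjust it to the actual file
    names of the downloaded legislature files.
    """
    return sorted(Path(directory).glob(pattern))


def load_legislature(path):
    """Load one legislature file into a pandas DataFrame."""
    # Imported here so that listing files works even without pandas installed.
    # Reading Parquet also needs an engine such as pyarrow or fastparquet.
    import pandas as pd

    # The gzip codec is recorded inside the Parquet file itself, so no
    # manual decompression step is needed before reading.
    return pd.read_parquet(path)
```

Calling `load_legislature(path)` for each entry of `list_parquet_files("data")` (where `data` is a hypothetical download folder) and printing `df.columns` shows which metadata and pre-processed text fields are available. Loading the files one by one rather than concatenating them keeps memory use lower, since several files exceed 100 MB compressed.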
Files
(781.9 MB)
| MD5 checksum | Size |
|---|---|
| md5:c89e320bf7d58d4cb9f72ffb6b776dc5 | 148.7 MB |
| md5:ade97480e89f4d398122026dee9ab992 | 145.8 MB |
| md5:889814fa54b5ea5344fedacba028b98e | 100.5 MB |
| md5:958f5d4040ff8de26227cbe796733cdd | 144.5 MB |
| md5:028c7aff2fa7fbd8a077bdffe353ff25 | 4.5 MB |
| md5:f12cc506a8a783af8b22f5ee0851cb78 | 104.0 MB |
| md5:f8c8a67a5a785dc6b8889ef17457f2be | 2.2 MB |
| md5:453f66257b72b4f5fc262046d3c47a09 | 131.8 MB |
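The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below uses only the Python standard library; because the file names are not shown on this page, it checks whether a file's digest appears anywhere in the set of published checksums instead of pairing each checksum with a name. The helper names `md5_of` and `matches_published_checksum` are hypothetical.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
PUBLISHED_MD5 = {
    "c89e320bf7d58d4cb9f72ffb6b776dc5",
    "ade97480e89f4d398122026dee9ab992",
    "889814fa54b5ea5344fedacba028b98e",
    "958f5d4040ff8de26227cbe796733cdd",
    "028c7aff2fa7fbd8a077bdffe353ff25",
    "f12cc506a8a783af8b22f5ee0851cb78",
    "f8c8a67a5a785dc6b8889ef17457f2be",
    "453f66257b72b4f5fc262046d3c47a09",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file's MD5 digest is one of the published ones."""
    return md5_of(path) in PUBLISHED_MD5
```

A file for which `matches_published_checksum` returns `False` was most likely truncated or altered in transit and should be downloaded again.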
Additional details
Related works
- Has part
- Publication: 10.31235/osf.io/ry4g2 (DOI)
Dates
- Available
- 2025-08-28: Parliamentary corpus interventions
Software
- Repository URL
- https://github.com/dibuja/polarisation-nlp
- Programming language
- Python
- Development Status
- Active