Published April 22, 2023 | Version v2
Dataset Open

FR-L-MIGR-TWIT Corpus. Migration Tweets of French Left-wing Politics.

  • 1. Université de Lille

Description

The FR-L-MIGR-TWIT Corpus is part of the MIGR-TWIT Corpus, a diachronic bilingual corpus of Tweets on the topic of migration in Europe.
Within the framework of the collaborative research project OLiNDiNUM (Observatoire LINguistique du DIscours NUMérique [Linguistic Observatory of Online Debate]), the MIGR-TWIT corpora were created to study the evolution of public discourse on migration in Europe from 2011 to 2022. The first two components of the corpus represent the migration discourse of right-wing politics in France and in the UK. The FR-L-MIGR-TWIT Corpus represents the migration discourse of French left-wing politics on Twitter.

Using the Twitter API v2 with Academic Research access, Tweets containing at least one word derived from the Latin root "migr" (from migrare) were automatically retrieved from 23 Twitter accounts of French left-wing political figures and parties.
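The selection rule described above (at least one word built on "migr") can be approximated locally with a regular expression. The following is a minimal sketch, not the query actually sent to the Twitter API, which is not given in this description; the pattern and the function names `find_migr_words` and `is_migr_tweet` are assumptions made for illustration.

```python
import re

# Assumption: a case-insensitive match on the string "migr" inside a word
# approximates the lexicon described above (migrant, immigration, émigré, ...).
MIGR_PATTERN = re.compile(r"\w*migr\w*", re.IGNORECASE)


def find_migr_words(text: str) -> list[str]:
    """Return every word in `text` that contains the string "migr"."""
    return MIGR_PATTERN.findall(text)


def is_migr_tweet(text: str) -> bool:
    """Return True if the text contains at least one migr-word."""
    return MIGR_PATTERN.search(text) is not None


print(find_migr_words("Les migrants et l'immigration"))
# ['migrants', 'immigration']
```

A purely lexical filter of this kind also matches off-topic uses of the root, which is why a manual check of the retrieved Tweets remains useful.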
 

Scientific reference: Jeon, S. (2025). Le discours numérique sur l'immigration en France entre 2011 et 2022. Une analyse de corpus (Online Discourse on Immigration in France between 2011 and 2022. A Corpus Analysis), PhD thesis, Université de Lille, France.

Contents
The downloadable version of the FR-L-MIGR-TWIT-2011-2022 Corpus contains 32 CSV files (tabular format). The corpus is provided in a simplified version and a complete version, which differ in the amount of metadata. The simplified version is a single file named FR-L-MIGR-TWIT-2011-2022.csv, containing four basic (meta)data fields: identifier, text, posting date and username (data__id, data__text, data__created_at and author__name in the table header). In addition to these four fields, the complete version includes all Tweet fields as header elements, such as the numbers of Replies, Retweets, Likes and Quotes. This version is also available as a single CSV file named FR-L-MIGR-TWIT-2011-2022_meta.csv.
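As an illustration of the four-column layout of the simplified file, the sketch below reads such a CSV with Python's standard `csv` module and checks that the four header elements are present. The comma delimiter, the UTF-8 encoding, the helper name `read_simplified` and the sample row (invented, including its timestamp format) are assumptions, not properties confirmed by this description.

```python
import csv
import io
from typing import TextIO

# Header elements named in the description of the simplified version.
EXPECTED_COLUMNS = ["data__id", "data__text", "data__created_at", "author__name"]


def read_simplified(stream: TextIO) -> list[dict[str, str]]:
    """Read the simplified corpus from an open text stream into a list of rows."""
    reader = csv.DictReader(stream)  # assumption: comma-separated values
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# In-memory stand-in for the real file; the row is invented for illustration.
sample = io.StringIO(
    "data__id,data__text,data__created_at,author__name\n"
    '1,"Exemple de texte, avec une virgule",2015-09-03T10:00:00.000Z,Example\n'
)
rows = read_simplified(sample)
print(rows[0]["data__text"])

# With the downloaded file (assumed to be UTF-8 encoded):
# with open("FR-L-MIGR-TWIT-2011-2022.csv", encoding="utf-8", newline="") as f:
#     rows = read_simplified(f)
```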

In addition, the complete version is provided as three zip files of CSV files. The zip file named FR-L-MIGR-TWIT-YEAR_meta contains 7 CSV files corresponding to grouped years (i.e. FR-L-MIGR-TWIT-2011-2016_meta.csv) or to individual years (e.g. FR-L-MIGR-TWIT-2017_meta.csv, and so on). The zip file named FR-L-NAME-MIGR-TWIT_meta contains 23 files, one for each of the selected French left-wing political figures and parties (e.g. FR-L-Arthaud-TWIT_meta.csv). The zip file named FR-L-MIGR-TWIT-2011-2022_meta contains the yearly Tweets of each political figure and party.
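The yearly partition described above (one grouped file for 2011-2016, then one file per year from 2017 to 2022) can be reproduced from the single metadata file. This is a minimal sketch under the assumption that data__created_at starts with a four-digit year, as in ISO 8601 timestamps; the helper names `period_label`, `yearly_filename` and `group_by_period` are hypothetical.

```python
from collections import defaultdict


def period_label(year: int) -> str:
    """Map a year to its period: 2011-2016 are grouped, later years stand alone."""
    return "2011-2016" if 2011 <= year <= 2016 else str(year)


def yearly_filename(label: str) -> str:
    """Build a file name following the pattern of the examples given above."""
    return f"FR-L-MIGR-TWIT-{label}_meta.csv"


def group_by_period(rows: list[dict[str, str]]) -> dict[str, list[dict[str, str]]]:
    """Group rows by period, reading the year from data__created_at."""
    groups = defaultdict(list)
    for row in rows:
        year = int(row["data__created_at"][:4])  # assumption: ISO 8601 timestamp
        groups[period_label(year)].append(row)
    return dict(groups)


labels = sorted({period_label(y) for y in range(2011, 2023)})
print(len(labels))  # 7, matching the 7 CSV files in the yearly zip file
```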

Detailed information on the FR-L-MIGR-TWIT-2011-2022 Corpus is given below.

  • Created at: 2023-04-18
  • Language: FR
  • Coverage: 23 user accounts ; 5,636 Tweets ; 169,818 words
  • Time of data collection: start=2011-01-01 ; end=2022-06-30
  • Keywords: words derived from the Latin root “migr” (from migrare)
  • Corpus composition:

 

No.  Political figure/party             Type of representative  Username          migr-Tweets
1    Adrien Quatennens                  PERSON (M)              @AQuatennens      315
2    Alexis Corbière                    PERSON (M)              @Alexiscorbiere   209
3    Anne Hidalgo                       PERSON (F)              @Anne_Hidalgo     801
4    Arnaud Montebourg*                 PERSON (M)              @montebourg       7
5    Benoît Hamon                       PERSON (M)              @benoithamon      172
6    Christiane Taubira                 PERSON (F)              @ChTaubira        11
7    Clémentine Autain                  PERSON (F)              @Clem_Autain      102
8    Danièle Obono                      PERSON (F)              @Deputee_Obono    415
9    Esther Benbassa**                  PERSON (F)              @EstherBenbassa   936
10   François Hollande                  PERSON (M)              @fhollande        28
11   François Ruffin                    PERSON (M)              @Francois_Ruffin  19
12   Jean-Luc Mélenchon                 PERSON (M)              @JLMelenchon      240
13   Manon Aubry                        PERSON (F)              @ManonAubryFr     182
14   Nathalie Arthaud                   PERSON (F)              @n_arthaud        165
15   Philippe Poutou                    PERSON (M)              @PhilippePoutou   83
16   Raphaël Glucksmann                 PERSON (M)              @rglucks1         142
17   Yannick Jadot                      PERSON (M)              @yjadot           374
18   Europe Écologie-Les Verts          ORGANIZATION            @EELV             484
19   Gauche Républicaine et Socialiste  ORGANIZATION            @Gauche_RS        73
20   Génération.s                       ORGANIZATION            @GenerationsMvt   165
21   La France Insoumise                ORGANIZATION            @FranceInsoumise  300
22   Parti Radical de Gauche            ORGANIZATION            @PartiRadicalG    37
23   Parti Socialiste                   ORGANIZATION            @partisocialiste  376

  • Political figures and parties, listed in alphabetical order, were selected according to four criteria: (1) a high number of migr-tweets, (2) political affiliation, (3) political career, namely having served as a Member of the European Parliament, or (4) having been a presidential candidate between 2011 and 2022. These four criteria are not mutually exclusive.
  • As part of a doctoral thesis (Jeon, 2025), the FR-L-MIGR-TWIT and FR-R-MIGR-TWIT corpora were compiled, annotated and analyzed using a comparative discourse analysis approach, with the aim of studying the semantic construction of the migr-lexicon between 2011 and 2022.
  • *One migration Tweet retrieved from the user account @montebourg for 2019 was removed and is not counted among his 7 migr-tweets, because it refers to the migration of honey bees.
  • **The user account @EstherBenbassa was added later. Esther Benbassa is a senator and former member of the political party Europe Écologie-Les Verts (represented by the user account @EELV); her account was included because of the high number of her migr-tweets that were retweeted by @EELV.
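As a consistency check on the corpus composition, the per-account migr-Tweet counts listed above can be summed and compared with the reported coverage of 23 user accounts and 5,636 Tweets. The sketch below only restates the figures from this description; the variable names are chosen for illustration.

```python
# migr-Tweet counts per user account, copied from the corpus composition above.
MIGR_TWEETS = {
    "@AQuatennens": 315, "@Alexiscorbiere": 209, "@Anne_Hidalgo": 801,
    "@montebourg": 7, "@benoithamon": 172, "@ChTaubira": 11,
    "@Clem_Autain": 102, "@Deputee_Obono": 415, "@EstherBenbassa": 936,
    "@fhollande": 28, "@Francois_Ruffin": 19, "@JLMelenchon": 240,
    "@ManonAubryFr": 182, "@n_arthaud": 165, "@PhilippePoutou": 83,
    "@rglucks1": 142, "@yjadot": 374, "@EELV": 484, "@Gauche_RS": 73,
    "@GenerationsMvt": 165, "@FranceInsoumise": 300, "@PartiRadicalG": 37,
    "@partisocialiste": 376,
}

account_count = len(MIGR_TWEETS)
total_tweets = sum(MIGR_TWEETS.values())

print(account_count)  # 23
print(total_tweets)   # 5636
```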

The MIGR-TWIT Corpus consists of three subcorpora totaling 23,869 Tweets and 703,016 words:

  • FR-R-MIGR-TWIT-2011-2022 Corpus: French Right-wing politics' migr-tweets
  • UK-R-MIGR-RA-TWIT-2011-2022 Corpus: British Right-wing politics' migr-tweets
  • FR-L-MIGR-TWIT-2011-2022 Corpus: French Left-wing politics' migr-tweets 

 

Notes

This corpus is distributed under the Creative Commons CC-BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/). Its reuse is permitted for non-commercial purposes, including research and education.

Series information

New version information

This dataset has been integrated into the FR-MIGR-TWIT Corpus 2.0.

Files

FR-L-MIGR-TWIT-2011-2022.csv

Files (10.6 MB)

  • md5:2f17eed1583f1f067c7aa24358ccc644 (1.6 MB)
  • md5:444f23393ab567f7f146e500fc88abcd (5.4 MB)
  • md5:f3ca70f2472a9d5590009b67f7f26bc4 (1.3 MB)
  • md5:e58da5ea26cf01d4b7a5d33f30b09640 (1.2 MB)
  • md5:8a5775a5e883bedc07b5cd2de0c75c14 (1.2 MB)
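The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The sketch below computes a digest with Python's standard `hashlib` module and tests whether it is one of the five listed values. Since this listing does not say which checksum belongs to which file, the check is membership only; the helper names `md5_of_stream` and `is_listed` are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO

# Checksums copied from the file listing above (without the "md5:" prefix).
LISTED_MD5 = {
    "2f17eed1583f1f067c7aa24358ccc644",
    "444f23393ab567f7f146e500fc88abcd",
    "f3ca70f2472a9d5590009b67f7f26bc4",
    "e58da5ea26cf01d4b7a5d33f30b09640",
    "8a5775a5e883bedc07b5cd2de0c75c14",
}


def md5_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_listed(hex_digest: str) -> bool:
    """Return True if the digest matches one of the listed checksums."""
    return hex_digest.lower() in LISTED_MD5


# Demonstration on an empty in-memory stream (not a corpus file).
demo_digest = md5_of_stream(io.BytesIO(b""))
print(is_listed(demo_digest))  # False

# With a downloaded file:
# with open("FR-L-MIGR-TWIT-2011-2022.csv", "rb") as f:
#     print(is_listed(md5_of_stream(f)))
```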

Additional details

Related works

Continues
Dataset: 10.5281/zenodo.7347479 (DOI)
Is continued by
Dataset: 10.5281/zenodo.17652657 (DOI)
Is previous version of
Dataset: 10.5281/zenodo.17828433 (DOI)

Funding

Université Lille Nord de France
Projet d'INternalisation 2021
Campus France
Hubert Curien Partnerships PHC Van Gogh 2018-19
Campus France
Hubert Curien Partnerships PHC Galilée 2018-19

References

  • Jeon, S. (2025). Le discours numérique sur l'immigration en France entre 2011 et 2022. Une analyse de corpus (Online Discourse on Immigration in France between 2011 and 2022. A Corpus Analysis). PhD thesis, Université de Lille.