This archive contains the #Élysée2017fr dataset.
(Initially published at https://web.archive.org/web/20200530171644if_/https://dataverse.mpi-sws.org/dataverse/icwsm18 on June 24, 2018. That dataverse is now defunct, so the dataset is reposted on Zenodo.)
The keywords used to collect the initial dataset, each presented with the start and stop dates of use (date format: YYYY-MM-DD).
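As an illustration of how the date ranges can be used, the following minimal sketch lists the keywords that were active on a given day. The `(keyword, start, stop)` row layout, the function name `active_keywords`, and the inclusive stop date are assumptions; the example rows are made up and do not come from the dataset.

```python
from datetime import date

def active_keywords(rows, day):
    """Return keywords whose collection window contains `day`.

    `rows` is an iterable of (keyword, start, stop) string triples,
    with dates in YYYY-MM-DD format. Treating the stop date as
    inclusive is an assumption.
    """
    target = date.fromisoformat(day)
    return [
        kw for kw, start, stop in rows
        if date.fromisoformat(start) <= target <= date.fromisoformat(stop)
    ]

# Made-up rows for illustration only; not taken from the dataset.
example_rows = [
    ("#keyword_a", "2017-01-01", "2017-03-31"),
    ("#keyword_b", "2017-02-15", "2017-05-07"),
]
print(active_keywords(example_rows, "2017-03-01"))
```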
The manual profile annotations. The file contains the following columns:
The profile's id used by Twitter
"**individual**" if the profile is managed by a single person, else "**non individual**".
The "**non individual**" label is itself divided in 3 subcategories:
- "**political**" for profiles of political parties or associations, and profiles representing groups of activists.
- "**media**" for profiles of media outlets.
- "**other**" for profiles not included in the previous categories.
The profile's political affiliation(s), given as the abbreviation of the political party:
- "**fi**": France Insoumise (far-left)
- "**ps**": Parti Socialiste (left)
- "**em**": En Marche ! (center)
- "**lr**": Les Républicains (right)
- "**fn**": Front National (far-right)
- **null**: no political affiliation
When a profile has two affiliations, they are separated by a slash (e.g. "ps/fi").
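The affiliation field can be split into a list of party codes with a few lines of Python. This is a minimal sketch: the function name `parse_affiliations` is hypothetical, and treating an empty string, `None`, or the literal text "null" as "no affiliation" is an assumption about how the null value is serialised in the file.

```python
KNOWN_PARTIES = {"fi", "ps", "em", "lr", "fn"}

def parse_affiliations(value):
    """Split an affiliation field into a list of party codes.

    Returns [] when there is no affiliation. Treating an empty
    string, None, or the text "null" as "no affiliation" is an
    assumption about how the null value is serialised.
    """
    if value is None or value.strip().lower() in ("", "null"):
        return []
    parties = [p.strip().lower() for p in value.split("/")]
    unknown = [p for p in parties if p not in KNOWN_PARTIES]
    if unknown:
        raise ValueError(f"unexpected party code(s): {unknown}")
    return parties

print(parse_affiliations("ps/fi"))
```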
*For individual profiles only.*
Indicates whether the profile's owner self-identifies as a media professional (journalist, editorialist, ...)
*For individual profiles only.*
Indicates the sex of the profile's owner:
- "**m**": male
- "**f**": female
- **null**: undetermined or other
Files containing the tweet and retweet ids, divided according to the political affiliation of their authors for more flexibility.
- **posts_ids_fi.csv**: Tweet ids for profiles affiliated to France Insoumise (far-left)
- **posts_ids_ps.csv**: Tweet ids for profiles affiliated to Parti Socialiste (left)
- **posts_ids_em.csv**: Tweet ids for profiles affiliated to En Marche ! (center)
- **posts_ids_lr.csv**: Tweet ids for profiles affiliated to Les Républicains (right)
- **posts_ids_fn.csv**: Tweet ids for profiles affiliated to Front National (far-right)
- **posts_ids_multi_affiliations.csv**: Tweet ids for profiles affiliated to more than one party
- **posts_ids_indetermined.csv**: Tweet ids for profiles not affiliated to any party
Each file contains one tweet id per line.
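Since each file holds one id per line, loading it needs no CSV parsing. The sketch below reads ids from any iterable of lines. The function name `read_tweet_ids` is hypothetical, and skipping a non-numeric first line as a possible header is an assumption, since the description does not say whether the files have one.

```python
import io

def read_tweet_ids(lines):
    """Yield tweet ids as strings from an iterable of text lines.

    Blank lines are skipped. Skipping a non-numeric first line
    (a possible header) is an assumption.
    """
    for i, line in enumerate(lines):
        token = line.strip()
        if not token:
            continue
        if not token.isdigit():
            if i == 0:
                continue  # assumed header row
            raise ValueError(f"line {i + 1}: not a tweet id: {token!r}")
        yield token

# In practice, pass an open file instead, for example:
#   with open("posts_ids_fi.csv", encoding="utf-8") as f: ...
sample = io.StringIO("123456789012345678\n\n987654321098765432\n")
ids = list(read_tweet_ids(sample))
print(len(ids))
```

The ids are kept as strings here so that they pass unchanged to downstream tools, some of which lose precision on 64-bit integers.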
Files containing the mention and retweet networks, in NCOL and GraphML format.
The NCOL files contain the directed weighted edges between profiles, one per line, in the following format:
profile1_twitter_id profile2_twitter_id edge_weight
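A file in this format can be read into a nested dictionary without any graph library. In this minimal sketch, the function name `parse_ncol` is hypothetical; reading `profile1` as the source and `profile2` as the target of each directed edge, and parsing the weight as a float, are both assumptions. The sample ids and weights are made up.

```python
from collections import defaultdict

def parse_ncol(lines):
    """Build {source: {target: weight}} from NCOL edge lines.

    Each line is "profile1_twitter_id profile2_twitter_id edge_weight".
    Treating profile1 as the edge source and profile2 as the target
    is an assumption; so is parsing the weight as a float.
    """
    graph = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
        source, target, weight = parts
        graph[source][target] = float(weight)
    return dict(graph)

# Made-up ids and weights, for illustration only.
sample = ["111 222 3", "222 111 1", "111 333 2"]
edges = parse_ncol(sample)
print(edges["111"])
```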
The GraphML files contain the directed weighted edges between profiles, as well as all the profile annotations presented in *profiles_annotations.csv*. They can be opened with graph visualisation software such as Gephi.
## How to get tweets from ids
Various tools can retrieve tweets from their ids; we suggest the following:
- DMI-TCAT: https://github.com/digitalmethodsinitiative/dmi-tcat
- Twarc: https://github.com/DocNow/twarc
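The tools above take care of batching requests for you. If you script the lookup yourself, the ids have to be sent in groups, which the sketch below illustrates. The function name `batched_ids` is hypothetical, and the default batch size of 100 is an assumption based on Twitter's tweet lookup endpoints historically accepting up to 100 ids per request; check the limits of the API or tool you actually use.

```python
from itertools import islice

def batched_ids(ids, size=100):
    """Yield successive lists of at most `size` ids.

    The default of 100 is an assumption, not a documented
    property of this dataset or of the tools listed above.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

# 250 made-up ids split into batches of 100, 100, and 50.
batches = list(batched_ids((str(n) for n in range(250)), size=100))
print([len(b) for b in batches])
```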
## How to cite this work
Fraisier Ophélie, Cabanac Guillaume, Pitarch Yoann, Besançon Romaric, Boughanem Mohand. 2018. #Élysée2017fr: the French Presidential Election on Twitter. In International Conference on Weblogs and Social Media. https://aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17821 (https://hal.archives-ouvertes.fr/hal-02319715)