Published May 11, 2021 | Version v1
Dataset Open

Catalan Referendum Twitter corpus

Description

This corpus consists of 46,962 tweets related to the Catalan referendum, a highly controversial topic in Spain because it was an independence referendum called by the Catalan regional government and suspended by the Constitutional Court of Spain at the request of the Spanish government. All tweets were downloaded on October 1, 2017 using the hashtags #CatalanReferendum or #ReferendumCatalan. On October 31, 2017, we collected the features of these tweets in order to analyze their virality. Each item in the collection consists of the features we used from each tweet in the virality analysis:

  • lang: Tweet language.
  • retweet_count: Total number of retweets recorded for a given tweet.
  • favourite_count: Total number of favourites recorded for a given tweet.
  • is_quote_status: Whether a tweet includes a quote of another tweet.
  • num_hashtags: Total number of hashtags in the tweet.
  • num_urls: Total number of URLs in the tweet.
  • num_mentions: Total number of users mentioned in the tweet.
  • interval_time: Interval of the day in which the tweet was published: morning (06:00-12:00), afternoon (12:00-18:00), evening (18:00-00:00) or night (00:00-06:00).
  • positive_words_iSOL: Total number of positive words found in the tweet using iSOL lexicon.
  • negative_words_iSOL: Total number of negative words found in the tweet using iSOL lexicon.
  • positive_words_NRC: Total number of positive words found in the tweet using NRC lexicon.
  • negative_words_NRC: Total number of negative words found in the tweet using NRC lexicon.
  • positive_words_mlSenticon: Total number of positive words found in the tweet using ML-SentiCon lexicon.
  • negative_words_mlSenticon: Total number of negative words found in the tweet using ML-SentiCon lexicon.
  • verified_user: Whether the tweet is from a verified user.
  • followers_count_user: Total number of users who follow the author of a tweet.
  • friends_count_user: Total number of users (friends) that the author of a tweet follows.
  • listed_count_user: Total number of lists that include the author of a tweet.
  • favourites_count_user: Total number of tweets favourited by the author of a tweet.
  • statuses_count_user: Total number of tweets made by the author since the creation of the account.
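
The feature list above can be turned into a small loading sketch. The Python below is illustrative only: it assumes the CSV has a header row whose column names match the feature names listed above and that it is comma-delimited and UTF-8 encoded, none of which is stated in this record. The helper names (interval_of_day, missing_columns, read_rows) are hypothetical, and the interval boundaries are treated as half-open (for example, 12:00 counts as afternoon), which is also an assumption.

```python
import csv
from datetime import time

# Feature names copied from the description; whether the CSV header
# uses exactly these names is an assumption.
EXPECTED_COLUMNS = [
    "lang", "retweet_count", "favourite_count", "is_quote_status",
    "num_hashtags", "num_urls", "num_mentions", "interval_time",
    "positive_words_iSOL", "negative_words_iSOL",
    "positive_words_NRC", "negative_words_NRC",
    "positive_words_mlSenticon", "negative_words_mlSenticon",
    "verified_user", "followers_count_user", "friends_count_user",
    "listed_count_user", "favourites_count_user", "statuses_count_user",
]


def interval_of_day(t: time) -> str:
    """Map a time of day to the interval names used for interval_time.

    Boundaries are treated as half-open (assumption): 06:00 is morning,
    12:00 is afternoon, 18:00 is evening, 00:00 is night.
    """
    hour = t.hour
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"


def missing_columns(header):
    """Return the expected feature names that are absent from a header."""
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def read_rows(path):
    """Yield each tweet as a dict, after checking the header.

    Assumes a comma-delimited, UTF-8 file with a header row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = missing_columns(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing expected columns: {missing}")
        yield from reader
```

Calling read_rows("Catalan_Referendum_Twitter_corpus.csv") would then iterate over the tweets as dictionaries of strings; numeric features such as retweet_count would still need converting with int() before analysis.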

Notes

Funding provided by: LIVING-LANG project from the Spanish Government
Award Number: RTI2018-094653-B-C21

Funding provided by: Junta de Andalucía
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100011011
Award Number: DOC_01073

Funding provided by: European Social Fund
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100004895

Funding provided by: Ministerio de Educación, Cultura y Deporte
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100003176
Award Number: FPU014/00983

Funding provided by: European Regional Development Fund
Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100008530

Files

Catalan_Referendum_Twitter_corpus.csv (3.1 MB)
md5:6a8bcc01dd8989aba6b5ec123ffcf02d
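
The md5 value listed for the file can be used to check a downloaded copy. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical, and the local path passed to them is whatever location the file was saved to.

```python
import hashlib

# Checksum as published in this record.
EXPECTED_MD5 = "6a8bcc01dd8989aba6b5ec123ffcf02d"


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file at `path` has the published MD5 digest."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == EXPECTED_MD5
```

For example, matches_published_checksum("Catalan_Referendum_Twitter_corpus.csv") should return True for an intact download. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.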