
Published April 30, 2021 | Version v2.0.0
Dataset Open


  • 1. University of Barcelona
  • 2. University of Helsinki
  • 3. Rijksuniversiteit Groningen
  • 4. Universita' degli studi di Milano Bicocca
  • 5. Instituto Superior de Ciencias Sociais e Politicas
  • 6. University of Oxford


This dataset is part of the EC Horizon 2020 project ALLINTERACT: Widening and diversifying citizen engagement in science (872396).
It contains the raw data obtained from the fieldwork, which consists of: 1) Literature Review, 2) Social Media Analytics, 3) Focus Groups, 4) Survey and 5) Social Media Communicative Observation.
1) Literature Review
The objective of the literature review was to address the following topics in gender and education: a) How citizens benefit from scientific research, b) Citizen awareness of the impact of scientific research, c) Awareness-raising initiatives that succeed in engaging citizens in scientific participation, including the Open Access movement and citizen science initiatives, d) Awareness-raising actions that foster the recruitment of new talent in the sciences and e) Policies that promote awareness-raising actions and citizen engagement in science.
To this end, searches were carried out in the top scientific databases, namely Web of Science (mainly in journals indexed in Journal Citation Reports) and Scopus. The selected articles were published between 2010 and 2021 in journals ranked Q1 or Q2 in JCR or Q1 in Scopus. Relevant reports from EU-funded research projects and official EU documents were also included.
We provide one Word file with the following information for each topic (a-e) in gender and education:
-    Keywords used
-    Criteria of selection
-    Identified sources
-    Outcomes
-    Annexes: Grids with the details of the identified sources
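
The inclusion criteria described above for journal articles (publication window and journal quartile) can be expressed as a simple filter. The following is only an illustrative sketch, not the project's actual screening procedure: the `Article` record and its field names are hypothetical, and reports from EU-funded projects and official EU documents, which were also included, fall outside this filter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    """Hypothetical record for a candidate article (field names are assumptions)."""
    title: str
    year: int
    jcr_quartile: Optional[str] = None     # e.g. "Q1"; None if not indexed in JCR
    scopus_quartile: Optional[str] = None  # e.g. "Q1"; None if not indexed in Scopus


def meets_criteria(article: Article) -> bool:
    """Published 2010-2021 and ranked Q1/Q2 in JCR or Q1 in Scopus."""
    in_window = 2010 <= article.year <= 2021
    jcr_ok = article.jcr_quartile in {"Q1", "Q2"}
    scopus_ok = article.scopus_quartile == "Q1"
    return in_window and (jcr_ok or scopus_ok)


candidates = [
    Article("Example A", 2015, jcr_quartile="Q2"),
    Article("Example B", 2009, jcr_quartile="Q1"),     # outside the date window
    Article("Example C", 2020, scopus_quartile="Q2"),  # Scopus Q2 does not qualify
    Article("Example D", 2021, scopus_quartile="Q1"),
]
selected = [a.title for a in candidates if meets_criteria(a)]
```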

2) Social Media Analytics
This is the raw data obtained from social media interactions (Twitter, Facebook, Instagram and Reddit) among citizens about citizen participation in science and research with social impact related to two Sustainable Development Goals: Quality Education and Gender Equality.
The data collection followed a twofold strategy: 1) Top-Down, in which researchers identified and selected relevant Twitter and Instagram hashtags and Facebook and Reddit pages, and 2) Bottom-Up, in which Twitter hashtags were selected based on daily Trending Topics.
The data were collected between March 9 and March 16, 2021, and were obtained, cleaned and anonymized following the Allinteract - Social Media Analytics Protocol (Flecha & Pulido, 2021).
We provide five Excel files (one for each social network explored). Each file contains the main information of the extracted messages; the fields extracted differ slightly between networks:
-Twitter: Tweet ID, Time, Tweet Type, Retweeted By, Number of Retweets, Hashtags
-Facebook: Post ID, Video, Type, Likes, Created Time, Updated Time, Comment ID, Comment Likes, Comment Time, Page Likes
-Instagram: Likes, comments, date
-Reddit: Row ID, sub_id, sub_title, sub_score, sub_date, comment_id, comment_score, comment_date
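
The column lists above can be captured as a small schema for checking a file's header row after download. This is a minimal sketch under stated assumptions: the dictionary `EXPECTED_COLUMNS` and the helper `missing_columns` are hypothetical names, the comparison ignores case and surrounding whitespace, and the actual header spelling in the Excel files may differ from the description.

```python
# Column names as listed in the dataset description, per social network.
EXPECTED_COLUMNS = {
    "Twitter": ["Tweet ID", "Time", "Tweet Type", "Retweeted By",
                "Number of Retweets", "Hashtags"],
    "Facebook": ["Post ID", "Video", "Type", "Likes", "Created Time",
                 "Updated Time", "Comment ID", "Comment Likes",
                 "Comment Time", "Page Likes"],
    "Instagram": ["Likes", "comments", "date"],
    "Reddit": ["Row ID", "sub_id", "sub_title", "sub_score", "sub_date",
               "comment_id", "comment_score", "comment_date"],
}


def missing_columns(network: str, header: list[str]) -> list[str]:
    """Return the expected columns that are absent from a header row."""
    present = {name.strip().lower() for name in header}
    return [col for col in EXPECTED_COLUMNS[network]
            if col.lower() not in present]


# Example: a header row read from the first line of a spreadsheet.
header_row = ["Tweet ID", "Time", "Hashtags"]
gaps = missing_columns("Twitter", header_row)
```

The header row itself could be read with any spreadsheet library; that step is left out here so the sketch stays independent of a particular tool.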

3) Focus Groups
This data file contains the pseudonymized transcriptions of a total of 6 focus groups in gender and 6 in education, which were conducted between October 2021 and February 2022. These focus groups constitute the pre-test, and the groups are therefore assigned to either a control group or an experimental group. The participants of the gender focus groups were women (including vulnerable women) from a women’s group, members of an LGBTQI group and women (including young women) from a women’s group. The participants of the education focus groups were parents, teachers and students.
We provide a Word file with the literal transcriptions of the focus groups in the language in which they were conducted (English, Spanish or Portuguese).

4) Survey
This data file contains the anonymous answers to the survey conducted with participants from 12 countries using a CATI/CAWI method. The survey was conducted between November 2021 and February 2022 and consists of 59 questions. The data were analysed with SPSS software.
We provide an Excel file with the 59 questions and the answers of 7507 participants.

5) Social Media Communicative Observation
The Social Media Communicative Observation aims to explore the effects of introducing scientific evidence into social media interactions as an initiative to increase participation through awareness. To this end, scientific evidence on gender and education was introduced in 10 Facebook groups (5 related to gender and 5 to education), 10 Reddit communities (5 related to gender and 5 to education) and 2 Social Impact Platforms (Sappho and Adhyayana).
We provide an Excel file with the anonymized interactions among users around each introduced piece of evidence. This Excel file contains the following information: Group of documents, document name, code, start, final, weight, segment, changed by, changed, created, comment, area and percentage (%).

6) Focus Groups – Post-test

This data file contains the pseudonymized transcriptions of a total of 6 post-test focus groups.

Funding: We acknowledge support of this work by the project "ALLINTERACT Widening and diversifying citizen engagement in science" (872396) from the European Commission Horizon 2020 programme.

Contact information
Ramón Flecha (PI):
Marta Soler Gallart (KMC Coordinator):
Pavel Oveiko (Ethics Chair):

Flecha, R., & Pulido, C. (2021). Allinteract - Social Media Analytics Protocol is licensed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License is available in

How to cite this dataset
Soler-Gallart, M. (2021). D1.1.Allinteract Raw Data is licensed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License



Files (15.1 MB)

Name Size
5.8 kB
14.3 MB
857.1 kB

Additional details

Related works

Is derived from
Technical note: ark:/13960/t3zt36n92 (ARK)


ALLINTERACT – Widening and diversifying citizen engagement in science 872396
European Commission

