Published June 5, 2024 | Version 2.0
Dataset Open

A Dataset of Multilingual Facebook Comments on Moros and Armed Conflict in the Southern Philippines

  • 1. University of Antwerp
  • 2. Ghent University
  • 3. University of the Philippines Diliman

Description

This dataset is a collection of 12,478 social media comments found on the official Facebook pages of ten Philippine newspapers: The Philippine Daily Inquirer, Manila Bulletin, The Philippine Star, The Manila Times, Sunstar Cebu, Sunstar Davao, Cebu Daily News, The Freeman, MindaNews, and The Mindanao Times. The comments span the years 2015, 2017 and 2019 and contain terms related to the Moro identity and to the Mamasapano Clash, the Marawi Siege and the establishment of BARMM in the southern Philippines. This allows researchers to study semantic fields with regard to Muslims, as well as the relationship between the texts and the source newspaper, its region of origin, and the political administration, among other variables. All comments in the dataset were downloaded through Facebook's Graph API via Facepager (Jünger & Keyling, 2019).

One CSV file (MMB151719SOCMED_v2.csv) is provided, along with a codebook that contains descriptions of the variables and codes used in the CSV file, and a Readme document with a changelog. 

Each social media comment is annotated with the following metadata: 

  1. object_id: identifier associated with the comment;
  2. message: the textual string of the comment;
  3. message_proc: the textual string of the comment after pre-processing;
  4. lang_label: categorical value for the language of the comment (Tagalog (Filipino), Cebuano, English, Taglish, Bislog, Bislish, Trilingual or Other);
  5. from_name: identifier of public pages (not profiles of individuals) leaving comments (NaN for profiles of individuals, 'NAME' for public pages other than the newspapers, otherwise the page name of the newspaper);
  6. created_time: string generated by the Facebook Graph API for the date and time the comment was posted;
  7. month_year: categorical value, in the form of a month abbreviation followed by YY (e.g. Jun-15), for the month and year when the comment was posted;
  8. year: numerical value in the form YY;
  9. newspaper: categorical value for the newspaper Facebook page under which the comment was found;
  10. corpus: categorical value for comments from the main corpus or the side (control) corpus;
  11. administration: categorical value for political administration (pbsa = President Benigno Aquino III, prrd = President Rodrigo Roa Duterte);
  12. count: numerical value referring to the number of string sequences without spaces in the comment.
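
As an illustration of how these variables might be used, the following minimal Python sketch tallies comments by lang_label and administration and compares count against a whitespace split of message. It rests on several assumptions not confirmed here: that the CSV has a header row with the variable names above, that it is comma-delimited, and that count corresponds to a plain whitespace split. The rows in SAMPLE are invented placeholders, not comments from the dataset, and the helper names are hypothetical.

```python
import csv
import io
from collections import Counter


def load_comments(handle):
    """Read comment rows from an open CSV handle into dicts keyed by column name."""
    return list(csv.DictReader(handle))


def tally(rows, column):
    """Count comments per value of a categorical column such as lang_label."""
    return Counter(row[column] for row in rows)


def whitespace_tokens(text):
    """Count whitespace-separated string sequences (our assumed reading of `count`)."""
    return len(text.split())


# Invented placeholder rows, NOT taken from the dataset; only some columns are shown.
SAMPLE = (
    "object_id,message,lang_label,administration,count\n"
    "a1,placeholder text one,English,pbsa,3\n"
    "a2,placeholder text,Tagalog (Filipino),prrd,2\n"
    "a3,another placeholder comment here,English,prrd,4\n"
)

rows = load_comments(io.StringIO(SAMPLE))
by_language = tally(rows, "lang_label")
by_administration = tally(rows, "administration")

# Rows whose stored count disagrees with a whitespace split of the message.
mismatches = [
    row["object_id"]
    for row in rows
    if whitespace_tokens(row["message"]) != int(row["count"])
]

print(by_language)
print(by_administration)
print(mismatches)
```

To run the same tallies on the published file, the StringIO object could be replaced with something like open("MMB151719SOCMED_v2.csv", newline="", encoding="utf-8"); the UTF-8 encoding is an assumption and should be checked against the codebook.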

The dataset may only be used for non-commercial purposes and is licensed under CC BY-NC-SA 4.0.

_____________________________________________________________________________________

V2 - 05/06/2024

Corrections

  • Corrections made to region to include Luzon, Visayas and Mindanao (as opposed to Mindanao, non-Mindanao);
  • Corrections made to administration coding. 

 

This dataset is described by: 

Cruz, F. A. (2024). A Multilingual Collection of Facebook Comments on the Moro Identity and Armed Conflict in the Southern Philippines. Journal of Open Humanities Data, 10(1), 41. DOI: https://doi.org/10.5334/johd.219

 

Bibliography

Jünger, J., & Keyling, T. (2019). Facepager: An application for automated data retrieval on the web (4.5.3) [Computer software]. https://github.com/strohne/Facepager/
 
 

Files

MMB151719SOCMED_codebook.pdf

Files (9.4 MB)

md5:04c397c760007ddadacc5d2dc0111b58 (405.6 kB)
md5:17e583c11fc00b434b0cf7b818ba0942 (9.0 MB)
md5:018b74313307adf3d98ac795b2eeff06 (1.4 kB)
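
The md5 values listed above can be used to confirm that a downloaded file is intact. The sketch below shows one way to do this with Python's standard hashlib module. The function names are hypothetical, and the demonstration hashes an in-memory byte string rather than one of the dataset files, since the mapping between checksums and file names is not stated here.

```python
import hashlib
import io


def md5_of_stream(handle, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(handle, expected):
    """Compare a stream's MD5 with an expected value, with or without an 'md5:' prefix."""
    expected_hex = expected.strip().split(":", 1)[-1].lower()
    return md5_of_stream(handle) == expected_hex


# Demonstration on in-memory bytes (not a dataset file).
demo_ok = matches_checksum(io.BytesIO(b"hello"), "md5:5d41402abc4b2a76b9719d911017c592")
demo_bad = matches_checksum(io.BytesIO(b"hello!"), "md5:5d41402abc4b2a76b9719d911017c592")
print(demo_ok, demo_bad)
```

For a downloaded file, the stream would be opened in binary mode, for example open(path, "rb"), and compared against the checksum shown next to that file on the record page.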

Additional details

Related works

Is described by
Data paper: 10.5334/johd.219 (DOI)

Funding

University of the Philippines System

Dates

Start of Collection: 2021-03-17
End of Collection: 2023-06-03