Published June 2, 2021 | Version v1
Dataset Open

BraCID: Brazilian Cultural Identity Information Through Reading Preferences

Description

In Brazil, each region has its own cultural identity in terms of accent, gastronomy, and traditions, all of which may be reflected in its literature. Specifically, we believe that a country's background and contextual features are directly related to what people read. Hence, we present an enhanced dataset that comprises cultural, geographic, and socioeconomic information to explore Brazilian cultural identity through reading preferences.

As our main data source, we chose the Goodreads website due to the sheer volume of data available and its organized and easily accessible API. We collect data from Brazilian readers through the goodreads library, which provides a Python interface to the Goodreads API. Specifically, we collect the members of two of the largest Brazilian reading groups: "Clube de Leitores em Português" (4,229 members) and "Goodreads Brasil" (3,222 members). For all members of both groups, we also collect data from their friends. From the resulting set of users, we keep only those whose location information contains Brazil. Finally, with the same library, we gather the users' bookshelves to assess their reading preferences.
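The location filter described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the user records are hypothetical dictionaries with a free-text `location` field, and the matching rule (a case-insensitive search for "Brazil" or "Brasil") is an assumption, since the description does not specify how the match was performed.

```python
def is_brazilian(location):
    """Return True if a free-text location mentions Brazil (assumed rule)."""
    if not location:
        return False
    text = location.casefold()
    return "brazil" in text or "brasil" in text


def filter_brazilian_users(users):
    """Keep only the user records whose location mentions Brazil."""
    return [u for u in users if is_brazilian(u.get("location"))]


# Hypothetical records standing in for group members and their friends.
users = [
    {"id": 1, "location": "São Paulo, SP, Brazil"},
    {"id": 2, "location": "Lisbon, Portugal"},
    {"id": 3, "location": None},
    {"id": 4, "location": "Belo Horizonte, Brasil"},
]

brazilian_users = filter_brazilian_users(users)
print([u["id"] for u in brazilian_users])  # [1, 4]
```

Users without any location information are dropped by this rule, which matches the description's wording that only users "containing Brazil as location information" are kept.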

To investigate the Brazilian reading identity, we consider a set of demographic and socioeconomic indicators from the Brazilian Institute of Geography and Statistics (IBGE), including territorial area, population estimate, demographic density, Human Development Index (HDI), Gross Domestic Product (GDP), and monthly household income per capita. All indicators refer to the year 2020, except the HDI, which refers to 2017. The data collection was carried out from February 23 to March 4, 2021.

Our final dataset, named BraCID, comprises:

  • 38,231 Brazilian Goodreads users
  • 75,093 distinct books
  • 80 literary genres
  • 6 IBGE indicators regarding the 27 federative units of Brazil
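As a rough illustration of how the user-level data might be combined with the per-state IBGE indicators, the sketch below counts users per federative unit and derives demographic density as population divided by area. All field names (`state`, `population`, `area_km2`) and all values are placeholders chosen for the example; they are not taken from the dataset files, whose actual column names may differ.

```python
from collections import Counter

# Placeholder rows: field names and values are hypothetical.
users = [
    {"user_id": 1, "state": "AA"},
    {"user_id": 2, "state": "AA"},
    {"user_id": 3, "state": "BB"},
]
indicators = {
    "AA": {"population": 1000, "area_km2": 50.0},
    "BB": {"population": 400, "area_km2": 80.0},
}


def summarize_by_state(users, indicators):
    """Attach a user count and demographic density to each state."""
    counts = Counter(u["state"] for u in users)
    summary = {}
    for state, ind in indicators.items():
        summary[state] = {
            "users": counts.get(state, 0),
            # Demographic density: inhabitants per square kilometre.
            "density": ind["population"] / ind["area_km2"],
        }
    return summary


summary = summarize_by_state(users, indicators)
print(summary["AA"])  # {'users': 2, 'density': 20.0}
```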

Files

books.csv

Files (90.7 MB)

Checksum                                Size
md5:53da9865a059c9f693743bb8254ca2b4    85.1 MB
md5:ab664233fbb7aa2a037ba58583a3c76f    19.5 kB
md5:a4d4dc48ecf1d22fd74eb77b0cec3999    12.0 kB
md5:18e82afbeb83f13e7aaf234d1f75a2d1    5.5 MB
md5:9d32041d69a21cd947d155b418d8734d    62.8 kB
md5:5a7e3b0ef03a780c90006f39ecd5b2ea    3.1 kB
md5:c6ca4f8ba79a1ea092582b640757e06e    1.6 kB
md5:c238799501b36cfae821c19043707b42    2.3 kB
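The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `verify` are illustrative, and the caller must supply the path of the downloaded file together with its matching checksum from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not fill memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 matches the expected checksum.

    Accepts the checksum with or without the 'md5:' prefix.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use constant, which matters for the largest file in the listing (85.1 MB).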