BraCID: Brazilian Cultural Identity Information Through Reading Preferences
Creators
- 1. Universidade Federal de Minas Gerais
Description
In Brazil, each region has its own cultural identity in terms of accent, gastronomy, and traditions, all of which may be reflected in its literature. Specifically, we believe that a country's background and contextual features are directly related to what its people read. Hence, we present an enhanced dataset that combines cultural, geographic, and socioeconomic information to explore Brazilian cultural identity through reading preferences.
As our main data source, we chose the Goodreads website because of the volume of data available and its organized, easily accessible API. We collect data on Brazilian readers through the goodreads library, which provides a Python interface to the Goodreads API. Specifically, we collect the members of two of the largest Brazilian reading groups: "Clube de Leitores em Português" (4,229 members) and "Goodreads Brasil" (3,222 members). For all members of both groups, we also collect data on their friends. From the resulting set of users, we keep only those whose location information contains Brazil. Finally, with the same library, we gather the users' bookshelves to assess their reading preferences.
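The merge-and-filter step described above can be sketched as follows. This is a minimal illustration, not the authors' collection code: the record layout (dictionaries with `id` and `location` keys), the function names, and the decision to also match the Portuguese spelling "Brasil" are all assumptions made for the example. No calls to the goodreads library are shown.

```python
from typing import Dict, Iterable, List, Optional


def is_brazilian(location: Optional[str]) -> bool:
    """Return True when a free-text location mentions Brazil.

    Assumption: both the English and Portuguese spellings are accepted.
    """
    if not location:
        return False
    text = location.casefold()
    return "brazil" in text or "brasil" in text


def merge_users(*sources: Iterable[dict]) -> Dict[int, dict]:
    """Merge user records from several sources, keeping one record per id."""
    merged: Dict[int, dict] = {}
    for source in sources:
        for user in source:
            # The first record seen for an id wins; later duplicates are ignored.
            merged.setdefault(user["id"], user)
    return merged


def filter_brazilian_users(users: Iterable[dict]) -> List[dict]:
    """Keep only users whose (hypothetical) 'location' field mentions Brazil."""
    return [u for u in users if is_brazilian(u.get("location"))]


if __name__ == "__main__":
    # Made-up records standing in for group members and their friends.
    group_a = [{"id": 1, "location": "Belo Horizonte, Brazil"}]
    group_b = [{"id": 1, "location": "Belo Horizonte, Brazil"},
               {"id": 2, "location": "Lisbon, Portugal"}]
    friends = [{"id": 3, "location": None}]
    candidates = merge_users(group_a, group_b, friends)
    print(filter_brazilian_users(candidates.values()))
```

Merging by user id before filtering avoids counting a reader twice when they belong to both groups or appear in several friend lists.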
To investigate the Brazilian reading identity, we consider a range of demographic and socioeconomic data from the Brazilian Institute of Geography and Statistics (IBGE), including territorial area, population estimate, demographic density, Human Development Index (HDI), Gross Domestic Product (GDP), and monthly household income per capita. All indicators refer to the year 2020, except the HDI, which refers to 2017. Data collection was carried out from February 23 to March 4, 2021.
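Three of the indicators listed above are related: demographic density is conventionally the population divided by the territorial area. The sketch below shows that relationship under the assumption that area is expressed in square kilometres; the function name is hypothetical, and the numbers in the usage example are placeholders, not IBGE figures.

```python
def demographic_density(population: float, area_km2: float) -> float:
    """Inhabitants per square kilometre (assumes area is given in km²)."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return population / area_km2


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(demographic_density(1_000, 50))
```

A check like this can be used to confirm that the density column is consistent with the population and area columns for each federative unit.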
Our final dataset, named BraCID, comprises:
- 38,231 Brazilian Goodreads users
- 75,093 distinct books
- 80 literary genres
- 6 IBGE indicators regarding the 27 federative units of Brazil
Files (90.7 MB)
Previewed file: books.csv
MD5 checksum | Size |
---|---|
md5:53da9865a059c9f693743bb8254ca2b4 | 85.1 MB |
md5:ab664233fbb7aa2a037ba58583a3c76f | 19.5 kB |
md5:a4d4dc48ecf1d22fd74eb77b0cec3999 | 12.0 kB |
md5:18e82afbeb83f13e7aaf234d1f75a2d1 | 5.5 MB |
md5:9d32041d69a21cd947d155b418d8734d | 62.8 kB |
md5:5a7e3b0ef03a780c90006f39ecd5b2ea | 3.1 kB |
md5:c6ca4f8ba79a1ea092582b640757e06e | 1.6 kB |
md5:c238799501b36cfae821c19043707b42 | 2.3 kB |
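The MD5 values listed with the files can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the local path in the usage note are assumptions, and each file should be paired with the checksum shown beside it on the record page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 with an expected value such as 'md5:53da...'."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download("path/to/downloaded_file", "md5:<checksum from the listing>")` returns `True` only when the local file matches the published checksum.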