Published February 2, 2022 | Version v1.1
Dataset Open

Infection Delay Project Data

  • 1. Oxford Internet Institute, University of Oxford
  • 2. Institute for Applied Economic Research, Brazil
  • 3. Applied Mathematics Department, University of São Paulo
  • 4. Department of Computer Science, University of Exeter

Description

The GitHub repository for the project code is available at https://github.com/shivyucel/infection-delay-project

The raw data folder contains the census and commuting data, along with their relevant shapefiles and labels. It also contains the shapefiles for the 2599 hexagons used in this analysis.

The prelim_data folder contains the h3_IDs table, which is used to identify hexagons numerically (see the info file in the folder for details). It also includes the commuting data, already preprocessed with the code in the Effective Distance/SIR Model folder to filter out regions that do not have suitable data.

The SIR_model_inputs folder contains the mobility matrix based on the commuting data, and the population data (both already filtered to include only suitable regions, as above).

The infection_delay_inputs folder contains the baseline and real mobility effective distance matrices. 

The mobility_data folder contains the raw mobility data from March to September 2020 (raw_mobility_data.csv), the mobility reductions across all hexagons with data around the lockdown (mobility_reduction.csv), and the mobility change used in the effective_distance_real_changes.py code (filtered_mobility_reduction.csv), filtered to the regions that are suitable for the effective distance/SIR calculations.

The result_data folder contains the tables used to generate the results, including the weighted median 10-day infection delay values for every region and the outbreak-divided results, merged with income and centrality data. The final data files are 'weighted_hexagon_data.csv' and 'longford_outbreak_split_delays'.
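The description does not say how the weighted median is computed or which weights are used. As a rough illustration only, the sketch below shows one common definition (the lower weighted median: the smallest value at which the cumulative weight reaches half of the total). The function name and the example inputs are hypothetical and are not taken from the project code.

```python
def weighted_median(values, weights):
    """Lower weighted median: smallest value whose cumulative weight
    reaches at least half of the total weight.

    Illustrative definition only; the project may use a different one.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    pairs = sorted(zip(values, weights))  # sort by value
    total = sum(w for _, w in pairs)
    if total <= 0:
        raise ValueError("total weight must be positive")
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= total / 2:
            return value


# Hypothetical delays (in days) with hypothetical weights.
delays = [1.0, 2.0, 3.0]
print(weighted_median(delays, [1, 1, 1]))   # equal weights: ordinary median
print(weighted_median(delays, [1, 1, 10]))  # heavy weight pulls the median up
```

With equal weights this reduces to the ordinary (lower) median; a large weight on one observation moves the result towards that observation.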

NOTE: The infection delay files take up 30+ GB of storage and are not included here. In each mobility scenario, 2599 tables are generated, each with 2599 columns, representing the time series of every hexagon for every outbreak scenario. These are then duplicated in reorient_infection_delay_tables.py to make them amenable to analysis.
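To give a sense of the scale behind the storage figure, the short calculation below counts the tables and time series implied by the note. It assumes there are two mobility scenarios (baseline and real mobility, as suggested by the infection_delay_inputs description); the length of each time series is not stated, so no byte estimate is attempted.

```python
N_HEXAGONS = 2599   # hexagons used in the analysis (from the description)
N_SCENARIOS = 2     # assumption: baseline and real mobility

# One table per outbreak origin, per scenario.
tables = N_SCENARIOS * N_HEXAGONS
# Each table has one column (time series) per hexagon.
series = tables * N_HEXAGONS

print(f"tables: {tables}")        # tables: 5198
print(f"time series: {series}")   # time series: 13509602
```

The count covers the tables before the duplication step in reorient_infection_delay_tables.py.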

Full commuting and census data, beyond that included in the raw data folder, can be found here (https://www.metro.sp.gov.br/pesquisa-od/) and here (https://censo2010.ibge.gov.br/resultados/resumo.html), respectively.

 

Files

data_files.zip (248.9 MB)
md5:f3260da4ba6581615786d5c52b84a3ef
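The md5 value listed for data_files.zip can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the local path `data_files.zip` is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "f3260da4ba6581615786d5c52b84a3ef"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("data_files.zip")  # assumed download location
if archive.exists():
    if file_md5(archive) == EXPECTED_MD5:
        print("checksum OK")
    else:
        print("checksum mismatch: re-download the archive")
```

Reading in chunks keeps memory use low for an archive of this size.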