The file "MainFigs.R" generates all the main plots in the paper. All the files it requires are in the "data/" folder.

The file "prevalence_correction_model.R" uses the functions in "prevalence_functions.R" to estimate the corrected prevalence with the model-based approach described in the paper.

All prevalence correction calculations are implemented in "prevalence_functions.R".
"prevalence_correction_model.R" produces the .csv files "correction Manaus.csv" and "correction Sao Paulo.csv".

Prevalence estimates generated by the different corrections were merged by hand to produce plot_df_prevalences.csv, which is required for Fig 2.

Description of datasets and variable names:

convalescent_plasma_longitudinal.csv - cohort of convalescent plasma donors with PCR-confirmed COVID-19
donor_id - random ID identifying repeated measures within a single individual
abbott_result - result of the Abbott IgG SARS-CoV-2 assay at a threshold of 1.4 S/C
abbott_sc - signal-to-cutoff (S/C) reading on the Abbott IgG SARS-CoV-2 assay
seropositive_at_baseline - whether the individual was seropositive at their baseline cohort visit
followup_visit - visit number in the cohort follow-up
days_post_symptoms - number of days following symptom onset that the blood sample was collected

sero_consolidated_weights.csv - all blood donation results from Manaus and São Paulo with associated survey weights
donmo - month of blood donation
donyr - year of blood donation
abbott_result - result of Abbott IgG SARS-CoV-2 test at a threshold of 1.4 S/C
abbott_sc - signal-to-cutoff (S/C) reading of Abbott IgG test
location - city of blood donation
abbott_result_alt - result of Abbott IgG SARS-CoV-2 test at a threshold of 0.4 S/C
weights - survey weights accounting for age and sex distribution of the sample compared to the resident population of Manaus and São Paulo
min_dt - lower bound of sampling window for given donation month (donmo)
max_dt - upper bound of sampling window for given donation month (donmo)
mid_adjust - mid point of sampling window
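
As an illustration of how the weights and the two thresholds could be used together, here is a minimal sketch in base R. It is not the code from "MainFigs.R" or "prevalence_functions.R". The toy values, the helper weighted_prev, the columns pos_standard and pos_alt, and the rule that a reading at or above the threshold counts as positive are all assumptions made for this example.

```r
# Toy data using the column names described above; all values are invented.
sero <- data.frame(
  location  = c("Manaus", "Manaus", "Manaus", "Sao Paulo", "Sao Paulo"),
  donmo     = c(6, 6, 6, 6, 6),
  abbott_sc = c(2.10, 0.80, 0.10, 1.50, 0.30),
  weights   = c(1.0, 2.0, 1.0, 1.5, 0.5)
)

# Assumption: a sample counts as positive when S/C is at or above the threshold.
sero$pos_standard <- as.integer(sero$abbott_sc >= 1.4)
sero$pos_alt      <- as.integer(sero$abbott_sc >= 0.4)

# Hypothetical helper: weighted share of positive donations.
weighted_prev <- function(positive, w) sum(w * positive) / sum(w)

# One estimate per city and donation month.
groups <- split(sero, list(sero$location, sero$donmo), drop = TRUE)
prev_standard <- sapply(groups, function(g) weighted_prev(g$pos_standard, g$weights))
prev_alt      <- sapply(groups, function(g) weighted_prev(g$pos_alt, g$weights))

print(round(cbind(prev_standard, prev_alt), 3))
```

With these toy values, lowering the threshold from 1.4 to 0.4 S/C raises the weighted estimate for Manaus and leaves the São Paulo estimate unchanged. Confidence intervals would need a survey-aware method and are not covered by this sketch.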

INFLUD-19-10-2020.csv - SIVEP-gripe dataset downloaded from https://covid.saude.gov.br/ on 19-10-2020. The data dictionary is also available at https://covid.saude.gov.br/.

plot_df_prevalences.csv - monthly seroprevalence estimates with the range of corrections described in variable "type"
location - city of blood donation
donmo - month of donation
prev - prevalence point estimate
ci_l - lower bound of the confidence interval
ci_u - upper bound of the confidence interval
min_dt - lower bound of sampling window
mid_adjust - mid-point of sampling window
type - type of prevalence correction applied

demography.csv - projected population size for São Paulo and Manaus in 2020; data are from http://doi.org/10.31406/relap2020.v14.i1.n26.6
age_low - lower bound of age bracket of age-sex stratum
age_high - upper bound of age bracket of age-sex stratum
region - city (Manaus or São Paulo)
sex - sex for age-sex stratum
date - year of population projection
population - population size

stand_deaths.csv - daily deaths for Manaus and São Paulo (taken from https://covid.saude.gov.br/), standardized for age and sex by the direct method using the Brazilian population (taken from http://doi.org/10.31406/relap2020.v14.i1.n26.6) as the reference
DT_EVOLUCA - date of death
brazil_rate - standardized daily mortality
total_deaths - total daily deaths
location - city of residence
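
For readers unfamiliar with direct standardization, the following base R sketch shows the general calculation on a toy example. It is not the code used to build stand_deaths.csv. The strata, counts, and column names are invented, and the real data use age-sex strata rather than the three age bands assumed here.

```r
# Hypothetical strata; all numbers are invented for illustration.
strata <- data.frame(
  stratum  = c("0-39", "40-69", "70+"),
  deaths   = c(2, 10, 30),             # deaths observed in the city
  city_pop = c(100000, 50000, 10000),  # city population per stratum
  ref_pop  = c(600000, 300000, 100000) # reference population per stratum
)

# Stratum-specific death rates in the city.
rate <- strata$deaths / strata$city_pop

# Share of each stratum in the reference population.
ref_share <- strata$ref_pop / sum(strata$ref_pop)

# Directly standardized rate: city rates weighted by the reference structure.
standardized_rate <- sum(rate * ref_share)

# Crude rate for comparison.
crude_rate <- sum(strata$deaths) / sum(strata$city_pop)

print(c(crude = crude_rate, standardized = standardized_rate))
```

In this toy case the standardized rate is higher than the crude rate because the reference population has a larger share of people in the oldest stratum than the city does.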

validation_data.csv - data for validation of the Abbott IgG SARS-CoV-2 serology assay
abbott_result - result of the Abbott IgG SARS-CoV-2 assay at a threshold of 1.4 S/C
abbott_sc - signal-to-cutoff value on the Abbott IgG SARS-CoV-2 assay
type - sample type
order - variable required for ordering the visualisation in ggplot2

verity_ifr.csv - estimates of the infection fatality ratio (IFR) from mainland China (taken from https://doi.org/10.1016/S1473-3099(20)30243-7)

