This README.txt file was created on 2022-07-09 by Kyle Colonna

GENERAL INFORMATION

1. Title of Software: "Software From: A Retrospective Assessment of COVID-19 Model Performance in the US"

2. Author Information

Corresponding Author
Name: Kyle J. Colonna
Institution: Harvard T.H. Chan School of Public Health
Email: kcolonna@g.harvard.edu

Co-author 1
Name: Gabriela F. Nane
Institution: Delft University of Technology

Co-author 2
Name: Ernani F. Choma
Institution: Harvard T.H. Chan School of Public Health

Co-author 3
Name: Roger M. Cooke
Institution: Delft University of Technology; Resources for the Future

Co-author 4
Name: John S. Evans
Institution: Harvard T.H. Chan School of Public Health

3. Date of data collection: 2022

4. Funding sources: N/A

5. Recommended citation for this software:
Colonna, Kyle J. et al. (2022), Software From: A Retrospective Assessment of COVID-19 Model Performance in the US, Zenodo, Software, https://doi.org/10.5281/zenodo.6799698

SOFTWARE & FILE OVERVIEW

1. Description of Software

All weekly mortality forecast/observation data were collected from the COVID-19 Forecast Hub's publicly available data repository (https://doi.org/10.5281/zenodo.6301718). State population data from the United States Census Bureau are also required to generate Figure 1 (https://www2.census.gov/programs-surveys/popest/datasets/2020-2021/state/totals/NST-EST2021-alldata.csv). Instructions for downloading the required data are provided in 'Read_Process_Data_070222.R'. These data were read and processed using the included code ('Read_Process_Data_070222.R'), which also produced Figure 1 in our manuscript. A short example of reading the Census file is given at the end of this section.

The following COVID-19 forecasting models were included in our analysis:

BPagano-RtDriven: https://bobpagano.com/covid-19-modeling/
CovidAnalytics-DELPHI: https://www.covidanalytics.io/DELPHI_documentation_pdf
COVIDhub-baseline: https://covid19forecasthub.org/
COVIDhub_CDC-ensemble: https://www.cdc.gov/coronavirus/2019-ncov/science/forecasting/mathematical-modeling.html
CU-nochange: https://doi.org/10.1101/2020.03.21.20040303
CU-scenario_low: https://doi.org/10.1101/2020.03.21.20040303
CU-scenario_mid: https://doi.org/10.1101/2020.03.21.20040303
CU-select: https://doi.org/10.1101/2020.03.21.20040303; https://www.medrxiv.org/content/10.1101/2020.05.04.20090670v2
DDS-NBDS: https://dds-covid19.github.io/
epiforecasts-ensemble1: https://doi.org/10.12688/wellcomeopenres.16006.1
GT-DeepCOVID: https://ojs.aaai.org/index.php/AAAI/article/view/17808
IHME-CurveFit: https://www.medrxiv.org/content/10.1101/2020.03.27.20043752v1
JHU_CSSE-DECOM: https://systems.jhu.edu/research/public-health/predicting-covid-19-risk/
JHUAPL-Bucky: https://github.com/mattkinsey/bucky
Karlen-pypm: https://arxiv.org/abs/2007.07156
MIT_CritData-GBCF: https://github.com/sakethsundar/covid-forecaster
MOBS-GLEAM_COVID: https://uploads-ssl.webflow.com/58e6558acc00ee8e4536c1f5/5e8bab44f5baae4c1c2a75d2_GLEAM_web.pdf
PSI-DRAFT: https://github.com/reichlab/covid19-forecast-hub/tree/master/data-processed/PSI-DRAFT
RobertWalraven-ESG: http://rwalraven.com/COVID19
SteveMcConnell-CovidComplete: https://stevemcconnell.com/covid
UCSD_NEU-DeepGLEAM: https://datascience.ucsd.edu/COVID19/
UMass-MechBayes: https://github.com/dsheldon/covid
USC-SI_kJalpha: https://arxiv.org/abs/2007.05180

We share the list of forecast locations and dates (i.e., questions) that were part of the analysis in the file 'Question_List.csv' (which can be generated by running the code in 'Read_Process_Data_070222.R').
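As a quick illustration of the data requirements above (and not a substitute for the download instructions in 'Read_Process_Data_070222.R'), the Census population file can be read directly from its URL. The following is a minimal R sketch; the object name 'state_pop' is our own illustrative choice, and the column names (SUMLEV, NAME, POPESTIMATE2021) are those published by the Census Bureau in this file.

# Minimal sketch: read the Census state population file used for Figure 1.
pop_url <- "https://www2.census.gov/programs-surveys/popest/datasets/2020-2021/state/totals/NST-EST2021-alldata.csv"
state_pop <- read.csv(pop_url)

# Keep state-level rows (SUMLEV 40) and the 2021 population estimates.
state_pop <- subset(state_pop, SUMLEV == 40, select = c(NAME, POPESTIMATE2021))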
The resulting dataset was then analyzed for predictive and probabilistic performance using the included code 'CM main 070822.R'. The other included functions ('calibrationScore1.R', 'informationScore1.R', 'constructDM1.R', and 'globalWeights_opt1.R') are necessary to run this performance assessment code. For more information on the performance criteria, forecast data, observation data, and model selection process, please see the main text of our manuscript.

2. File List:

calibrationScore1.R
CM main 070822.R
constructDM1.R
globalWeights_opt1.R
informationScore1.R
Question_List.csv
Read_Process_Data_070222.R

3. Programming Software

All analyses were conducted in R. Data were read and processed in R v. 3.5.1, and the predictive and probabilistic performance analysis was conducted in R v. 3.6.2.

4. Running the code

The analysis can be replicated by running the main code file 'CM main 070822.R'.

4.1. Inputs

Running 'CM main 070822.R' requires the following:

4.1.1. The library files 'calibrationScore1.R', 'constructDM1.R', 'globalWeights_opt1.R', and 'informationScore1.R'. All of them are provided with the code.

4.1.2. The following input files, which are generated by 'Read_Process_Data_070222.R'. Note that this code also reads and processes data for experts 12, 13, and 15; however, experts 13 and 15 are excluded from the analysis, and expert 12 is also removed when constructing the decision makers.

1) realizations.csv
This is a dataset with two columns:
Column 1 (variable named 'LabelQuestion'): Question number
Column 2 (variable named 'Realization'): The truth data for each question
Each row represents one question - in our case, the dataset contains 1,572 rows.

2) Question_List.csv (also provided with the code)
This is a dataset with five columns:
Column 1 (variable named 'LabelQuestion'): Question number
Column 2 (variable named 'target_end_date'): The last day of the forecast target week
Column 3 (variable named 'location'): FIPS code for each forecast location (i.e., U.S. state or District of Columbia)
Column 4 (variable named 'location_name'): Name of each forecast location (i.e., U.S. state or District of Columbia)
Column 5 (variable named 'Realization'): The truth data for each question
Each row represents one question - in our case, the dataset contains 1,572 rows.

3) For each Model/Expert, a file named Exp_[xx].csv, where [xx] is a two-digit number assigned to that expert. This is a dataset with seven columns; one dataset is required for each Model (Expert):
Column 1 (variable named 'Expert_ID'): A number serving as the ID for each expert
Column 2 (variable named 'LabelQuestion'): Question number
Column 3 (variable named 'Percent_5'): Model/Expert prediction -- the 5th percentile
Column 4 (variable named 'Percent_25'): Model/Expert prediction -- the 25th percentile
Column 5 (variable named 'Percent_50'): Model/Expert prediction -- the 50th percentile
Column 6 (variable named 'Percent_75'): Model/Expert prediction -- the 75th percentile
Column 7 (variable named 'Percent_95'): Model/Expert prediction -- the 95th percentile
Each row represents one question - in our case, the dataset contains 1,572 rows.

4.2. Outputs

The following output files are generated by 'CM main 070822.R'. Note that to obtain these output files, you will need to call R's write.csv() function on the corresponding objects. Accuracy and precision scores of the decision makers (i.e., DMs) were generated using Excel and are not provided here. For more information on how to calculate accuracy and precision scores, please see our main text.
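For orientation, the following is a minimal sketch of one possible end-to-end run, assuming the working directory contains all of the files listed above. The object name passed to write.csv() ('calscores_all') is illustrative and may differ from the names used inside 'CM main 070822.R'.

# Step 1: generate the input files described in 4.1.2
# (realizations.csv, Question_List.csv, and the Exp_[xx].csv files).
source("Read_Process_Data_070222.R")

# Step 2: load the helper functions listed in 4.1.1 (harmless if
# 'CM main 070822.R' already sources them itself).
source("calibrationScore1.R")
source("informationScore1.R")
source("constructDM1.R")
source("globalWeights_opt1.R")

# Step 3: run the performance assessment.
source("CM main 070822.R")

# Step 4: export a results table with write.csv(); the object name
# 'calscores_all' is illustrative.
write.csv(calscores_all, "calscores_all.csv", row.names = FALSE)

The six output files documented below follow this pattern.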
1) accuracy_models.csv
This is a dataset with 24 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Columns 2-24: Experts in order of increasing number and their performance scores
Each row represents one state or the nation - in our case, the dataset contains 50 rows.

2) calscores_all.csv
This is a dataset with 25 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Column 2: Number of dates (i.e., questions) that were part of the analysis
Columns 3-25: Experts in order of increasing number and their performance scores
Each row represents one state or the nation - in our case, the dataset contains 50 rows.

3) calscores_all_DM.csv
This is a dataset with 3 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Column 2 (variable named 'GWDM opt'): The performance scores of the globally weighted decision maker (a.k.a. the Classical Model's optimized performance-weighted ensemble, as referred to in the text)
Column 3 (variable named 'EWDM'): The performance scores of the equally weighted decision maker (a.k.a. the constructed equal-weighted ensemble, as referred to in the text)

4) infoscores_all.csv
This is a dataset with 24 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Columns 2-24: Experts in order of increasing number and their performance scores
Each row represents one state or the nation - in our case, the dataset contains 50 rows.

5) infoscores_all_DM.csv
This is a dataset with 3 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Column 2 (variable named 'GWDM opt'): The performance scores of the globally weighted decision maker (a.k.a. the Classical Model's optimized performance-weighted ensemble, as referred to in the text)
Column 3 (variable named 'EWDM'): The performance scores of the equally weighted decision maker (a.k.a. the constructed equal-weighted ensemble, as referred to in the text)

6) precision_models.csv
This is a dataset with 24 columns:
Column 1: States in alphabetical order; the last row is U.S. national
Columns 2-24: Experts in order of increasing number and their performance scores
Each row represents one state or the nation - in our case, the dataset contains 50 rows.
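As a brief usage example, the decision-maker score files can be inspected as follows. This is a minimal sketch, assuming 'calscores_all_DM.csv' has been written out as described above; the object name 'dm_scores' is illustrative.

# Read the decision-maker calibration scores (check.names = FALSE preserves
# the 'GWDM opt' column name, which contains a space).
dm_scores <- read.csv("calscores_all_DM.csv", check.names = FALSE)

# Count the locations where the optimized performance-weighted ensemble
# scores at least as well as the equal-weighted ensemble.
sum(dm_scores[["GWDM opt"]] >= dm_scores[["EWDM"]])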