Published January 26, 2023 | Version 1.1.0
Other Open

BY-COVID - WP5 - Baseline Use Case: SARS-CoV-2 vaccine effectiveness assessment - Common Data Model Specification

  • 1. Institute for Health Science in Aragon (IACS)
  • 2. Sciensano
  • 1. DANS
  • 2. GÖG
  • 3. Institute for Health Science in Aragon (IACS)

Description

This publication corresponds to the Common Data Model (CDM) specification of the Baseline Use Case proposed in T.5.2 (WP5) of the BY-COVID project on “SARS-CoV-2 Vaccine(s) effectiveness in preventing SARS-CoV-2 infection”.

Research Question: “How effective have the SARS-CoV-2 vaccination programmes been in preventing SARS-CoV-2 infections?”

  • Intervention (exposure): COVID-19 vaccine(s) 
  • Outcome: SARS-CoV-2 infection 
  • Subgroup analysis: Vaccination schedule (type of vaccine)

Study Design: An observational retrospective longitudinal study to assess the effectiveness of the SARS-CoV-2 vaccine in preventing SARS-CoV-2 infections using routinely collected social, health and care data from several countries.

A causal model was established using Directed Acyclic Graphs (DAGs) to map domain knowledge, theories and assumptions about the causal relationship between exposure and outcome. The DAG developed for the research question is provided in the causal model report included among the files of this publication.
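As a rough illustration of what a DAG encodes, the sketch below represents a toy causal graph as a mapping from each node to its direct causes and checks that it contains no cycles. It is not the published causal model: only the exposure (`vaccine`) and outcome (`infection`) come from the text above, while `age` is a hypothetical confounder added purely for illustration, and the helper names are assumptions.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical toy graph: each key maps a node to the set of its direct causes.
# "vaccine" (exposure) and "infection" (outcome) come from the text;
# "age" is an illustrative confounder, not a node taken from the published DAG.
parents = {
    "vaccine": {"age"},
    "infection": {"vaccine", "age"},
    "age": set(),
}


def is_acyclic(graph):
    """Return True if the parent map describes a directed acyclic graph."""
    try:
        # Consuming static_order() raises CycleError if the graph has a cycle.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


def common_causes(graph, exposure, outcome):
    """Return the nodes that are direct causes of both exposure and outcome."""
    return graph[exposure] & graph[outcome]
```

In this toy graph, `common_causes(parents, "vaccine", "infection")` picks out `age`, the kind of variable an analysis would typically adjust for.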

Cohort definition: All people eligible to be vaccinated (aged 5 to 115 years, inclusive) or with at least one dose of a SARS-CoV-2 vaccine (any of the available brands), with or without a previous SARS-CoV-2 infection.

  • Inclusion criteria: All people vaccinated with at least one dose of a COVID-19 vaccine (any of the available brands) in an area of residence. Any person eligible to be vaccinated (aged 5 to 115 years, inclusive) with a positive diagnosis of SARS-CoV-2 infection (COVID-19), irrespective of the type of test, during the study period.
  • Exclusion criteria: People not eligible for the vaccine (aged 0 to 4 years, inclusive).
  • Study period: From the date of the first documented SARS-CoV-2 infection in each country to the most recent date for which data are available at the time of analysis. Roughly from 1 March 2020 to 30 June 2022, depending on the country.
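The criteria above can be sketched as a simple per-person filter. This is one possible reading of the inclusion and exclusion rules, not the project's implementation: the parameter names `age` and `doses` are hypothetical, `confirmed_case_dt` is borrowed from the change log, and the study dates are the approximate bounds quoted above, which vary by country.

```python
from datetime import date

# Age of first eligibility, from the exclusion criterion (0 to 4 years excluded).
MIN_AGE = 5
# Upper eligibility bound from the cohort definition (inclusive).
MAX_AGE = 115
# Approximate study period; the actual bounds depend on the country.
STUDY_START = date(2020, 3, 1)
STUDY_END = date(2022, 6, 30)


def in_cohort(age, doses, confirmed_case_dt=None):
    """Apply one reading of the inclusion/exclusion criteria to a person.

    `age` and `doses` are hypothetical field names used for illustration.
    """
    if age < MIN_AGE:
        # Exclusion: not eligible for the vaccine.
        return False
    if doses >= 1:
        # Inclusion: vaccinated with at least one dose.
        return True
    if age > MAX_AGE:
        # Unvaccinated people must fall in the eligible age range.
        return False
    # Inclusion: eligible person with a positive diagnosis in the study period.
    return (
        confirmed_case_dt is not None
        and STUDY_START <= confirmed_case_dt <= STUDY_END
    )
```

Under this reading, the exclusion criterion takes precedence over vaccination status, so a record for a vaccinated four-year-old would be dropped.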

Files included in this publication: 

  • Causal model (responding to the research question) 
    • SARS-CoV-2 vaccine effectiveness causal model v.1.0.0 (HTML) - Interactive report showcasing the structural causal model (DAG) to answer the research question
    • SARS-CoV-2 vaccine effectiveness causal model v.1.0.0 (QMD) - Quarto RMarkdown script to produce the structural causal model 
  • Common data model specification (following the causal model)
    • SARS-CoV-2 vaccine effectiveness data model specification (XLSX) - Human-readable version (Excel)
    • SARS-CoV-2 vaccine effectiveness data model specification dataspice (HTML) - Human-readable version (interactive report)
    • SARS-CoV-2 vaccine effectiveness data model specification dataspice (JSON) - Machine-readable version
  • Synthetic dataset (complying with the common data model specifications)
    • SARS-CoV-2 vaccine effectiveness synthetic dataset (CSV) [UTF-8, pipe (|) separated, N~650,000 records]
    • SARS-CoV-2 vaccine effectiveness synthetic dataset EDA (HTML) - Interactive report of the exploratory data analysis (EDA) of the synthetic dataset
    • SARS-CoV-2 vaccine effectiveness synthetic dataset EDA (JSON) - Machine-readable version of the exploratory data analysis (EDA) of the synthetic dataset
    • SARS-CoV-2 vaccine effectiveness synthetic dataset generation script (IPYNB) - Jupyter notebook with Python scripting and commenting to generate the synthetic dataset 
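Because the synthetic dataset is UTF-8 encoded and pipe-separated rather than comma-separated, the delimiter has to be set explicitly when reading it. The following is a minimal sketch using only the Python standard library. The helper name `read_pipe_csv` is an assumption, the two column names are taken from the change log, and the sample values are made up for illustration.

```python
import csv
import io


def read_pipe_csv(stream):
    """Parse pipe-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(stream, delimiter="|"))


# Tiny in-memory stand-in for the real file. The column names appear in the
# change log; the values are invented and do not reflect the codebook.
sample = io.StringIO("sex_cd|confirmed_case_dt\n1|2021-01-15\n2|\n")
rows = read_pipe_csv(sample)

# For the real file, open it with the documented encoding, for example:
#   with open(path_to_csv, encoding="utf-8", newline="") as f:
#       rows = read_pipe_csv(f)
# where path_to_csv is wherever the CSV was extracted from the archive.
```

Every value is read as a string, and empty fields come back as empty strings, so dates and codes still need to be converted according to the data model specification.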

#### Baseline Use Case: SARS-CoV-2 vaccine effectiveness assessment - Common Data Model Specification v.1.1.0 change log ####

  • Updated Causal model to eliminate the consideration of 'vaccination_schedule_cd' as a mediator 
  • Adjusted the study period to be consistent with the Study Protocol
  • Updated 'sex_cd' as a required variable
  • Added 'chronic_liver_disease_bl' as a comorbidity at the individual level
  • Updated 'socecon_lvl_cd' at the area level as a recommended variable
  • Added crosswalks for the definition of 'chronic_liver_disease_bl' in a separate sheet
  • Updated the 'vaccination_schedule_cd' reference to the 'Vaccine' node in the updated DAG
  • Updated the description of the 'confirmed_case_dt' and 'previous_infection_dt' variables to clarify their definition and the need for a single record per person

Notes

The scripts (software) accompanying the data model specification are offered "as-is", without warranty, and the authors disclaim liability for damages resulting from their use. The software is released under the CC-BY-4.0 licence, which permits you to use the content for almost any purpose (but does not grant you any trademark permissions), so long as you note the licence and give credit.

Files (35.8 MB)

baseline_use_case_data_model_specification_v.1.1.0.zip

Additional details

Related works

Is required by
Proposal: 10.5281/zenodo.7560731 (DOI)

Funding

BY-COVID – Beyond COVID 101046203
European Commission