CDC Social Vulnerability Index across time and space
Description
Social Vulnerability Index (SVI) Multi-Geographic Level Dataset 2012-2021
Description
This dataset compiles Social Vulnerability Index (SVI) calculations across multiple geographic levels in the United States from 2012 to 2021. The data were generated using the findSVI R package, which extends the CDC/ATSDR SVI methodology to additional geographic levels and more recent American Community Survey (ACS) data. This archive preserves these social vulnerability metrics so that this public health planning tool remains accessible.
Data Coverage
Temporal Coverage
- Years: 2012-2021
- Annual data available for all geographic levels
Geographic Coverage
- Geographic levels included:
  - CBSA (Core Based Statistical Areas)
  - Combined Statistical Areas
  - Congressional Districts
  - Counties
  - County Subdivisions
  - Metropolitan/Micropolitan Statistical Areas
  - Places
  - Public Use Microdata Areas (PUMAs)
  - States
  - State Legislative Districts (Upper and Lower Chambers)
  - Census Tracts
Variables
Core Variables
- GEOID: Federal Information Processing Standards (FIPS) code uniquely identifying geographic areas
- RPL_theme1: Socioeconomic Status theme percentile ranking
- RPL_theme2: Household Composition theme percentile ranking
- RPL_theme3: Minority Status & Language theme percentile ranking
- RPL_theme4: Housing Type & Transportation theme percentile ranking
- RPL_themes: Overall Social Vulnerability Index percentile ranking
- year: Year of the SVI data (2012-2021)
- state: State identifier (two-letter code or 'US' for national level)
- geo: Geographic level identifier
Methodology
The dataset was created using the findSVI R package, which:
- Retrieves 16 demographic variables from the American Community Survey
- Calculates percentile rankings for each variable
- Aggregates rankings into four themes: socioeconomic status, household composition, minority status/language, and housing/transportation
- Generates overall SVI scores through percentile calculations
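The steps above can be sketched in a few lines of Python. This is an illustration of the general rank-then-sum idea, not the findSVI implementation: the function names are hypothetical, the percentile rank uses the common definition (minimum rank - 1) / (n - 1), and details such as tie handling, rounding, and variables ranked in reverse order may differ in the package. The input numbers are toy values.

```python
def percent_rank(values):
    """Percentile rank in [0, 1]: (min rank - 1) / (n - 1); ties share the lowest rank."""
    n = len(values)
    if n < 2:
        return [0.0 for _ in values]
    # Position of a value's first occurrence in sorted order equals (min rank - 1).
    first_index = {}
    for i, v in enumerate(sorted(values)):
        first_index.setdefault(v, i)
    return [first_index[v] / (n - 1) for v in values]


def theme_and_overall(columns_by_theme):
    """columns_by_theme maps a theme name to a list of variable columns,
    where each column holds one value per geographic unit."""
    theme_sums = {}
    for theme, columns in columns_by_theme.items():
        ranked = [percent_rank(col) for col in columns]
        # Sum the variable percentile ranks within the theme, unit by unit.
        theme_sums[theme] = [sum(vals) for vals in zip(*ranked)]
    # Theme ranking: percentile rank of each theme's summed ranks.
    theme_rpl = {theme: percent_rank(sums) for theme, sums in theme_sums.items()}
    # Overall ranking: percentile rank of the sum of the theme sums.
    overall_sum = [sum(vals) for vals in zip(*theme_sums.values())]
    return theme_rpl, percent_rank(overall_sum)


# Toy input: three geographic units, two themes (values are made up).
toy = {
    "theme1": [[1, 2, 3], [3, 2, 1]],
    "theme2": [[1, 2, 3]],
}
theme_rpl, overall = theme_and_overall(toy)
print(percent_rank([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(theme_rpl, overall)
```

Because each ranking is relative to the set of units passed in, the same unit can receive a different percentile when the reference area changes.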
Technical Details
- File Format: Parquet and CSV. Both contain identical data, but the Parquet file is recommended because it stores geographic identifiers as character strings, which prevents truncation of the leading zeros common in census geographic identifiers.
- Data Structure: Long format with consistent variable names across geographic levels
- Missing Data: Noted for geographic units lacking component variables
- Special Cases:
  - DC and Nebraska have specific handling for legislative districts
  - Some geographies don't require state arguments in calculations
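The leading-zero issue with the CSV file can be shown with a small Python sketch. The rows below are a synthetic stand-in for the archived file: the scores are made up, and the `county` label for the `geo` column is an assumption, since the actual labels are not listed here. Python's `csv` module returns every field as text, so the identifier survives intact; the zero is lost only when the field is converted to a number.

```python
import csv
import io

# Synthetic two-row sample standing in for the archived CSV (scores are made up;
# the "county" label in the geo column is an assumption).
sample = io.StringIO(
    "GEOID,RPL_themes,year,state,geo\n"
    "01001,0.42,2020,AL,county\n"
    "42101,0.87,2020,PA,county\n"
)

rows = list(csv.DictReader(sample))  # every field is read as a string

geoid_text = rows[0]["GEOID"]        # kept as text, leading zero preserved
geoid_as_int = int(geoid_text)       # numeric conversion drops the leading zero
repaired = str(geoid_as_int).zfill(5)  # county GEOIDs are 5 digits (2 state + 3 county)

print(geoid_text)    # 01001
print(geoid_as_int)  # 1001
print(repaired)      # 01001
```

Padding back with `zfill` works only when the expected width is known for the geographic level (for example 2 digits for states, 5 for counties, 11 for census tracts), so reading identifiers as text from the start is the safer choice.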
Usage Notes
- Data can be filtered by geographic level and year for specific analysis needs
- Percentile rankings are calculated relative to specified geographic reference areas
- Some geographic units may have missing values due to incomplete ACS data
- Users should consider the appropriate geographic level for their specific research needs
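As a minimal sketch of the filtering described above, the following Python reads rows in the long format and selects one geographic level and year. The helper names, the sample rows, the `geo` labels, and the set of missing-value spellings are all assumptions made for illustration; check the actual file for the values it uses.

```python
import csv
import io

MISSING = {"", "NA", "NaN"}  # assumed spellings of missing values


def load_rows(handle):
    """Read long-format rows, keeping GEOID as text and parsing score and year."""
    rows = []
    for row in csv.DictReader(handle):
        raw = row["RPL_themes"]
        row["RPL_themes"] = None if raw in MISSING else float(raw)
        row["year"] = int(row["year"])
        rows.append(row)
    return rows


def select(rows, geo, year):
    """Keep rows for one geographic level and one year."""
    return [r for r in rows if r["geo"] == geo and r["year"] == year]


# Synthetic sample (made-up scores; "county" and "state" labels are assumptions).
sample = io.StringIO(
    "GEOID,RPL_themes,year,state,geo\n"
    "01001,0.42,2020,AL,county\n"
    "01003,NA,2020,AL,county\n"
    "01005,0.91,2020,AL,county\n"
    "01001,0.40,2019,AL,county\n"
    "01,0.75,2020,US,state\n"
)

rows = load_rows(sample)
counties_2020 = select(rows, "county", 2020)

# Drop units with missing scores before comparing them.
scored = [r for r in counties_2020 if r["RPL_themes"] is not None]
most_vulnerable = max(scored, key=lambda r: r["RPL_themes"])
print(len(counties_2020), len(scored), most_vulnerable["GEOID"])
```

Filtering on both `geo` and `year` before any comparison avoids mixing percentile rankings that were computed against different reference sets.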
Citation
When using this dataset, please cite:
- The findSVI R package: Xu, H., Li, R., & Bilal, U. (2024). findSVI: an R package to calculate the Social Vulnerability Index at multiple geographical levels. Journal of Open Source Software, 9(99), 6525.
- The original CDC/ATSDR SVI methodology: Flanagan, B. E., Gregory, E. W., Hallisey, E. J., Heitgerd, J. L., & Lewis, B. (2011). A social vulnerability index for disaster management. Journal of Homeland Security and Emergency Management, 8(1).
Access
The dataset is available for public use and can be accessed either through findSVI or through the archived files on this Zenodo page.
Files
svi_across_time_space_backup.csv
Additional details
Related works
- Is derived from
- Software: 10.21105/joss.06525 (DOI)
Software
- Repository URL
- https://github.com/heli-xu/findSVI
- Programming language
- R
- Development Status
- Active