Published July 24, 2024 | Version 1
Dataset Open

NCHS mortality data 2014-2022

  • 1. Yale University

Description

This is a database (parquet format) containing publicly available multiple-cause mortality data from the US (CDC/NCHS) for 2014-2022. Not all variables are included in this export. Please see below for restrictions on the use of these data imposed by NCHS. You can use the arrow package in R to open the file. For an example analysis, see https://github.com/DanWeinberger/pneumococcal_mortality/blob/main/analysis_nongeo.Rmd . For instance, after saving this file in a folder called "parquet3":

 

library(arrow)

library(dplyr)

pneumo.deaths.in <- open_dataset("R:/parquet3", format = "parquet") %>% # open the dataset without reading it into memory
  filter(grepl("J13|A39|J181|A403|B953|G001", all_icd)) %>% # keep records that contain any of the selected ICD codes
  collect() # load the result into memory. Do as much filtering and summarizing as you can before calling collect() to limit memory use
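The advice to do as much work as possible before collect() can be illustrated with a small aggregation. This is a minimal sketch, not part of the original analysis: it writes a made-up toy table to a temporary folder (the toy records and the folder name are assumptions for illustration only) and counts matching deaths by year and month before anything is pulled into memory.

```r
library(arrow)
library(dplyr)

# Toy stand-in for the real file (made-up records), written to a temporary folder
toy <- data.frame(
  year = c(2014L, 2014L, 2015L, 2015L),
  month = c(1L, 1L, 2L, 3L),
  all_icd = c("J13_I10", "A403_J189", "I219_E119", "G001"),
  stringsAsFactors = FALSE
)
toy_dir <- file.path(tempdir(), "parquet_toy")
write_dataset(toy, toy_dir, format = "parquet")

monthly_counts <- open_dataset(toy_dir, format = "parquet") %>%
  filter(grepl("J13|A39|J181|A403|B953|G001", all_icd)) %>% # same filter as above
  group_by(year, month) %>%
  summarise(deaths = n()) %>% # aggregate before loading into memory
  collect() %>%               # only the small summary table is loaded
  arrange(year, month)

print(monthly_counts)
```

With the real data, replacing toy_dir with the folder that holds the downloaded file should give one row per year and month rather than one row per death record.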

 

The included variables are listed below (see the full dictionary: https://www.cdc.gov/nchs/nvss/mortality_public_use_data.htm).

year: Calendar year of death

month: Calendar month of death

age_detail_number: number indicating years or part of a year; it cannot be interpreted on its own here. See the agey variable instead.

sex: M/F

place_of_death:

Place of Death and Decedent’s Status
1 ... Hospital, Clinic or Medical Center - Inpatient
2 ... Hospital, Clinic or Medical Center - Outpatient or admitted to Emergency Room
3 ... Hospital, Clinic or Medical Center - Dead on Arrival
4 ... Decedent’s home
5 ... Hospice facility
6 ... Nursing home/long term care
7 ... Other
9 ... Place of death unknown 
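The code list above can be turned into readable labels with a named lookup vector. This is a sketch under assumptions: the sample values are made up, and the lookup goes through as.character() because the storage type of place_of_death in the file is not stated here.

```r
# Lookup table built from the code list above
place_labels <- c(
  "1" = "Hospital, Clinic or Medical Center - Inpatient",
  "2" = "Hospital, Clinic or Medical Center - Outpatient or admitted to Emergency Room",
  "3" = "Hospital, Clinic or Medical Center - Dead on Arrival",
  "4" = "Decedent's home",
  "5" = "Hospice facility",
  "6" = "Nursing home/long term care",
  "7" = "Other",
  "9" = "Place of death unknown"
)

# Made-up example values
place_of_death <- c(1, 4, 6, 9, 4)

# as.character() makes the lookup work for numeric or character codes
place_label <- unname(place_labels[as.character(place_of_death)])

table(place_label)
```

The same pattern should work for race_recode_new with its own lookup vector.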

all_icd: Causes of death coded as ICD-10 codes. ICD1-ICD21 are pasted into a single string, with codes separated by underscores.
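Because the codes are joined with underscores, the string can be split back into individual codes when a match on whole codes (or code prefixes) is preferred over a regular expression on the full string. The following is a minimal base R sketch; has_code is a hypothetical helper and the example strings are made up.

```r
# Made-up all_icd strings in the underscore-separated layout described above
all_icd <- c("J13_I10", "I219_E119", "A403_J189_N179")

# Hypothetical helper: TRUE if any code in a record starts with one of the prefixes
has_code <- function(icd_string, prefixes) {
  code_list <- strsplit(icd_string, "_", fixed = TRUE)
  vapply(code_list, function(codes) {
    any(vapply(prefixes, function(p) any(startsWith(codes, p)), logical(1)))
  }, logical(1))
}

# Number of codes listed on each record
n_codes <- lengths(strsplit(all_icd, "_", fixed = TRUE))

has_code(all_icd, c("J13", "A403"))
```

This is meant for data already loaded with collect(); for the full dataset, the grepl() filter on the unsplit string is likely the cheaper first step.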

hisp_recode: 0=Non-Hispanic; 1=Hispanic; 999= Not specified

race_recode: race coding prior to 2018 (reconciled in race_recode_new)

race_recode_alt:  race coding after 2018 (reconciled in race_recode_new)

race_recode_new:

  1='White'

  2='Black'

  3='Hispanic'

  4='American Indian'

  5='Asian/Pacific Islanders'

agey:

  age in years (or a fraction of a year for children <12 months)
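Since agey is continuous and holds fractions of a year for infants, it can be binned into age groups with cut(). The age bands and the sample values below are assumptions chosen for illustration, not groupings defined by this dataset.

```r
# Made-up ages in years; 0.25 stands for an infant aged about 3 months
agey <- c(0.25, 3, 17, 45, 64, 65, 90)

# Illustrative age bands; intervals are closed on the left, e.g. [1, 5)
age_group <- cut(
  agey,
  breaks = c(0, 1, 5, 18, 65, Inf),
  right = FALSE,
  labels = c("<1", "1-4", "5-17", "18-64", "65+")
)

table(age_group)
```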

 

 

https://www.cdc.gov/nchs/data_access/restrictions.htm

Please Read Carefully Before Using NCHS Public Use Survey Data

The National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), conducts statistical and epidemiological activities under the authority granted by the Public Health Service Act (42 U.S.C. § 242k). NCHS survey data are protected by Federal confidentiality laws including Section 308(d) Public Health Service Act [42 U.S.C. 242m(d)] and the Confidential Information Protection and Statistical Efficiency Act or CIPSEA [Pub. L. No. 115-435, 132 Stat. 5529 § 302]. These confidentiality laws state the data collected by NCHS may be used only for statistical reporting and analysis. Any effort to determine the identity of individuals and establishments violates the assurances of confidentiality provided by federal law.

 

Terms and Conditions

NCHS does all it can to assure that the identity of individuals and establishments cannot be disclosed. All direct identifiers, as well as any characteristics that might lead to identification, are omitted from the dataset. Any intentional identification or disclosure of an individual or establishment violates the assurances of confidentiality given to the providers of the information. Therefore, users will:

  1. Use the data in this dataset for statistical reporting and analysis only.
  2. Make no attempt to learn the identity of any person or establishment included in these data.
  3. Not link this dataset with individually identifiable data from other NCHS or non-NCHS datasets.
  4. Not engage in any efforts to assess disclosure methodologies applied to protect individuals and establishments or any research on methods of re-identification of individuals and establishments.

By using these data you signify your agreement to comply with the above-stated statutorily based requirements.

 

Sanctions for Violating NCHS Data Use Agreement

Willfully disclosing any information that could identify a person or establishment in any manner to a person or agency not entitled to receive it, shall be guilty of a class E felony and imprisoned for not more than 5 years, or fined not more than $250,000, or both.

Files

Files (431.7 MB)

md5:15f30dea922e4bdb3439bfc14e2229b6
431.7 MB
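The md5 value listed above can be used to confirm that a download is complete. This is a minimal sketch using tools::md5sum from base R; verify_md5 is a hypothetical helper, and the local path is a placeholder because the file name is not given in this listing.

```r
# Hypothetical helper: compare a file's MD5 checksum with an expected value
verify_md5 <- function(path, expected) {
  actual <- unname(tools::md5sum(path)) # NA if the file does not exist
  !is.na(actual) && identical(tolower(actual), tolower(expected))
}

# Placeholder path; replace with wherever you saved the download
local_file <- "R:/parquet3/your_downloaded_file.parquet"

if (file.exists(local_file)) {
  print(verify_md5(local_file, "15f30dea922e4bdb3439bfc14e2229b6"))
}
```

A result of FALSE suggests the download was truncated or altered and should be repeated.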

Additional details

Software

Repository URL
https://github.com/DanWeinberger/pneumococcal_mortality
Programming language
R