There is a newer version of the record available.

Published November 10, 2025 | Version v1
Dataset Open

United States forensic DNA databases: NDIS, SDIS, and FOIA datasets

Description

This record contains three datasets describing U.S. forensic DNA databases: an NDIS time series (2001–2025), a 2025 SDIS state cross-section, and 2018 FOIA demographic tables.

The National DNA Index System (NDIS) dataset is a longitudinal series reconstructed from archived FBI webpages, providing jurisdiction-level counts of offender, arrestee, and forensic profiles, along with the number of participating laboratories and investigations aided. The State DNA Index System (SDIS) dataset was compiled through a systematic search of state repositories and statutes conducted in 2025, capturing reported totals, disaggregated counts where available, arrestee-collection status, familial-search policies, and statutory references across all 50 states. The FOIA dataset contains demographic summaries from seven states released in 2018 (Murphy & Tong, 2020), standardized into a tidy long format with race and gender categories harmonized to U.S. Census classifications and provenance flags distinguishing reported from calculated values.
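As a rough illustration of how provenance flags can be used in a tidy long table, the sketch below groups rows by flag and checks a calculated total against the reported values it was derived from. The column names (`state`, `variable`, `category`, `value`, `value_source`) and the rows are hypothetical placeholders, not taken from the dataset; consult the data dictionary for the actual schema.

```python
import csv
import io

# Placeholder rows with hypothetical column names; NOT values from the dataset.
SAMPLE = """state,variable,category,value,value_source
State A,group,Group 1,40,reported
State A,group,Group 2,60,reported
State A,group,Total,100,calculated
"""


def split_by_provenance(rows, flag_column="value_source"):
    """Group long-format rows by their provenance flag."""
    groups = {}
    for row in rows:
        groups.setdefault(row[flag_column], []).append(row)
    return groups


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
groups = split_by_provenance(rows)

# Sum the reported values and compare with the value flagged as calculated.
reported_total = sum(int(r["value"]) for r in groups["reported"])
calculated_total = int(groups["calculated"][0]["value"])
```

Keeping the flag as its own column means derived values can be filtered out or recomputed without touching the values that were reported directly.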

All three datasets are provided as machine-readable CSVs with accompanying data dictionaries and reproducibility notebooks. A provenance manifest documents raw source locations (e.g., Internet Archive snapshots for NDIS, state webpages for SDIS), ensuring transparency even where redistribution of source files is not possible. Figures used in the associated manuscript are also included to illustrate data structure and validation.

Note: The largest folder contains the NDIS time series snapshots. It is zipped inside the main zipped folder (i.e., nested), so after unzipping the main folder you will still need to unzip the ndis folder. This prevents unwittingly opening more than 10,000 files if you do not immediately need the raw data. Please see README.md for a description of the file and folder structure. The data are also available in the GitHub repository and on the accompanying Quarto website if you would like to browse without downloading.
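The two-step unzip described in the note can be sketched in Python roughly as follows. The function name `extract_archive`, the `nested_names` parameter, and the example file names in the comments are assumptions for illustration; the actual archive and folder names are given in README.md.

```python
import zipfile
from pathlib import Path


def extract_archive(archive, dest, nested_names=()):
    """Extract `archive` into `dest`.

    Nested zip files are left untouched unless their stem (file name
    without ".zip") is listed in `nested_names`. Returns the folders
    created from nested zips.
    """
    dest = Path(dest)
    with zipfile.ZipFile(archive) as outer:
        outer.extractall(dest)

    extracted = []
    # Materialize the listing first so new folders do not affect iteration.
    for inner in list(dest.rglob("*.zip")):
        if inner.stem in nested_names:
            target = inner.with_suffix("")  # e.g. ndis.zip -> ndis/
            with zipfile.ZipFile(inner) as nested:
                nested.extractall(target)
            extracted.append(target)
    return extracted
```

Calling `extract_archive(path_to_main_zip, "data")` would unpack only the top level, while passing `nested_names=("ndis",)` would also unpack a nested `ndis.zip` (name assumed) when the raw snapshots are needed.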

Files

README.md

Files (41.0 MB)

Name Size
md5:848385dab5b7eeb181c69a1eba9bf6ed 41.0 MB
md5:54811b24ac7a20d3b6f92d95ec56dca9 14.4 kB
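A downloaded file can be checked against the md5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_hex` and `matches_listing` are illustrative, and the local file path is whatever name the file was saved under.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_hex(path) == expected
```

For example, `matches_listing(local_path, "md5:848385dab5b7eeb181c69a1eba9bf6ed")` should return `True` for an intact copy of the 41.0 MB file.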

Additional details

Additional titles

Subtitle
NDIS time series (2001–2025), SDIS cross-sections, and FOIA-based demographic data.

Related works

Is supplemented by
Computational notebook: https://github.com/lasisilab/PODFRIDGE-Databases

Software

Development Status
Active