Published January 29, 2020 | Version v1
Presentation Open

Getting the Lead out–Does New York City's childhood lead testing make statistical sense?

  • 1. Columbia University

Description

The US has dramatically reduced blood lead levels in children over the past 30 years, and that effort continues. New York City (NYC) was an early adopter of lead reduction policies and now has laws that require all children to be tested, with mandatory interventions for those whose tested blood lead level (tbll) is greater than 5 µg/dL. But there is a statistically interesting story around how current blood lead level limits are set, how common tests perform, and how to apply Bayes rule reasoning to publicly available data.

The data we have: We have high-quality blood lead level (bll) tests applied nationwide (NHANES) for 5,000 children. We have NYC-supplied data giving counts of all children tested and of the number whose tested level is greater than 5 µg/dL, 10 µg/dL and 15 µg/dL. We also have widely varying claims about blood test performance, drawn from sources such as FDA applications for blood testing equipment, actual studies of test performance and government testing standards.

The data we want: New York City recently dropped the threshold for intervention from 10 µg/dL to 5 µg/dL. It is an open question what the false positive rate is at these test thresholds, with some research suggesting that it is as high as 70%. At the other extreme, an FDA application for the LeadCare Plus testing device claims a standard deviation of 0.5 at the 5 µg/dL level, which suggests a very low false positive rate, but that depends on the distribution of actual blls in the NYC population.

How we got the data we wanted: This is a simple application of Bayes rule: p(bll > 5 | tbll > 5) = p(tbll > 5 | bll > 5) p(bll > 5) / p(tbll > 5), where we don't know p(bll > 5) for NYC. NYC refused to release non-quantized tbll data under FOIA requests; with that data, determining false positive rates from tbll test evaluations would be fairly straightforward. But we do have data for the US as a whole in non-quantized form.
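To make the Bayes rule calculation concrete, here is a minimal sketch in Python. The prevalence, sensitivity and false alarm values are hypothetical placeholders chosen only for illustration; they are not taken from the NYC or NHANES data, and the function name is ours. The denominator p(tbll > 5) is expanded with the law of total probability.

```python
def posterior_elevated(prevalence, sensitivity, false_alarm_rate):
    """Return p(bll > 5 | tbll > 5) via Bayes rule.

    prevalence       -- p(bll > 5), the unknown quantity for NYC
    sensitivity      -- p(tbll > 5 | bll > 5)
    false_alarm_rate -- p(tbll > 5 | bll <= 5)
    """
    # Law of total probability gives the denominator p(tbll > 5).
    p_test_positive = (sensitivity * prevalence
                       + false_alarm_rate * (1.0 - prevalence))
    return sensitivity * prevalence / p_test_positive


# Hypothetical inputs: none of these numbers come from the source data.
for prevalence in (0.01, 0.02, 0.05):
    ppv = posterior_elevated(prevalence, sensitivity=0.90, false_alarm_rate=0.03)
    print(f"prevalence={prevalence:.2f}  "
          f"p(bll>5 | tbll>5)={ppv:.3f}  "
          f"share of positives that are false={1.0 - ppv:.3f}")
```

The sweep over prevalence shows why p(bll > 5) matters so much: holding test performance fixed, the share of positive tests that are false changes substantially as the assumed prevalence changes.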

The paper describes a process of model refinement, starting with naive approaches and incrementally modifying our models to better suit NYC data. The final approach, subject to change as we do more work, is to fit national NHANES data with an exponential distribution, assume that a similar distribution applies to NYC, and recover a believable false positive rate across a range of reported blood test performance. Along the way we show an interesting, simple use of the ‘integrate_ode_rk45’ function in Stan and demonstrate Bayesian workflow.
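The final approach described above can be sketched roughly as follows. This is not the authors' Stan model: it is a plain Python illustration that assumes normally distributed measurement error with a constant standard deviation (our assumption), uses synthetic exponential draws as a stand-in for the NHANES measurements, and replaces the ODE-based integration with a simple midpoint rule. All function names and parameter values are hypothetical.

```python
import math
import random

THRESHOLD = 5.0  # intervention threshold in µg/dL


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential distribution (1 / sample mean)."""
    return len(samples) / sum(samples)


def normal_sf(x, mean, sd):
    """P(X > x) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def false_positive_share(rate, test_sd, threshold=THRESHOLD,
                         upper=50.0, steps=25000):
    """P(bll <= threshold | tbll > threshold) by midpoint-rule integration.

    Assumes bll ~ Exponential(rate) and tbll = bll + Normal(0, test_sd) noise.
    """
    dx = upper / steps
    positive_and_low = 0.0
    positive_total = 0.0
    for i in range(steps):
        bll = (i + 0.5) * dx
        # Prior mass in this cell times the chance the test reads above threshold.
        weight = rate * math.exp(-rate * bll) * dx * normal_sf(threshold, bll, test_sd)
        positive_total += weight
        if bll <= threshold:
            positive_and_low += weight
    return positive_and_low / positive_total


# Synthetic stand-in for national data (mean of 1 µg/dL is a made-up value).
rng = random.Random(0)
synthetic_bll = [rng.expovariate(1.0) for _ in range(5000)]
rate = fit_exponential_rate(synthetic_bll)

# Sweep a range of hypothetical test standard deviations.
for test_sd in (0.5, 1.0, 2.0):
    share = false_positive_share(rate, test_sd)
    print(f"test sd={test_sd:.1f}  false positive share={share:.3f}")
```

Under these assumptions the false positive share depends on both the test's standard deviation and the shape of the fitted distribution, which is the dependence the description points to.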

Files

auerbach_baldwin.pdf (1.5 MB)
md5:30e649c85319f366a08eac500e8e6555