Published January 26, 2016 | Version v1
Journal article Open

Multiple imputation of multiple multi-item scales when a full imputation model is infeasible

  • 1. Centre for Health Economics and Medicines Evaluation, Bangor University, Ardudwy, Normal Site, Holyhead Road, Bangor, Gwynedd, LL57 2PZ, UK
  • 2. MRC Clinical Trials Unit at UCL, Institute of Clinical Trials and Methodology, Aviation House, 125 Kingsway, London, WC2B 6NH, UK
  • 3. MRC Biostatistics Unit, Cambridge Institute of Public Health, Robinson Way, Cambridge, CB2 0SR, UK

Description

Background: Missing data in a large-scale survey present major challenges. We focus on performing multiple imputation by chained equations when data contain multiple incomplete multi-item scales. Recent authors have proposed imputing such data at the level of the individual item, but this can lead to infeasibly large imputation models.

Methods: We use data gathered from a large multinational survey, where analysis uses separate logistic regression models in each of nine country-specific data sets. In these data, applying multiple imputation by chained equations to the individual scale items is computationally infeasible. We propose an adaptation of multiple imputation by chained equations which imputes the individual scale items but reduces the number of variables in the imputation models by replacing most scale items with scale summary scores. We evaluate the feasibility of the proposed approach and compare it with a complete case analysis. We perform a simulation study to compare the proposed method with alternative approaches: we do this in a simplified setting to allow comparison with the full imputation model.
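The adaptation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one plausible reading of the abstract: when imputing an item, the other items of the same scale are kept as predictors, while every other scale enters only through its summary score (here, the mean of its items). The function name `impute_scales_once`, the `scales` layout, and the normal linear imputation model are hypothetical choices made for illustration. A proper multiple imputation would also draw the regression parameters from their posterior and would use models suited to the item type.

```python
import numpy as np

def impute_scales_once(data, scales, n_iter=5, seed=0):
    """Return one imputed copy of `data` (illustrative sketch only).

    data   : 2-D float array, np.nan marks a missing item response.
    scales : list of lists of column indices, one list per scale
             (hypothetical layout, not taken from the article).
    """
    rng = np.random.default_rng(seed)
    missing = np.isnan(data)
    filled = data.copy()

    # Starting values: replace each missing entry with its column mean.
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.nonzero(missing)[1])

    for _ in range(n_iter):
        for k, items in enumerate(scales):
            for j in items:
                mis_j = missing[:, j]
                if not mis_j.any():
                    continue
                # Assumed predictor set: remaining items of the same scale,
                # plus one summary score (item mean) per other scale.
                same = [filled[:, i] for i in items if i != j]
                other = [filled[:, s].mean(axis=1)
                         for m, s in enumerate(scales) if m != k]
                X = np.column_stack([np.ones(len(filled))] + same + other)

                # Fit on rows where item j was observed.
                obs = ~mis_j
                beta, *_ = np.linalg.lstsq(X[obs], filled[obs, j], rcond=None)
                resid = filled[obs, j] - X[obs] @ beta
                dof = max(int(obs.sum()) - X.shape[1], 1)
                sigma = np.sqrt(resid @ resid / dof)

                # Impute as prediction plus random residual noise.
                filled[mis_j, j] = X[mis_j] @ beta + rng.normal(
                    0.0, sigma, int(mis_j.sum()))
    return filled
```

Calling the function several times with different `seed` values would give several imputed data sets to analyse and pool. With, say, three scales of four items each, every item model in this sketch has five predictors (three same-scale items and two summary scores) instead of the eleven used when all other items are included.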

Results: For the case study, the proposed approach reduces the size of the prediction models from 134 predictors to a maximum of 72 and makes multiple imputation by chained equations computationally feasible. Distributions of imputed data are consistent with those of the observed data. Results from the regression analysis with multiple imputation are similar to, but more precise than, results from complete case analysis; for the same regression models, a 39% reduction in the standard error is observed. The simulation shows that our proposed method can perform comparably to the alternatives.
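As a rough aid to reading the 39% figure, the reduction in standard error can be converted into a variance ratio. The last step below assumes that the standard error scales as $1/\sqrt{n}$, which the abstract does not state, so the "equivalent sample size" is only an illustrative interpretation.

```latex
\[
\frac{\mathrm{SE}_{\mathrm{MI}}}{\mathrm{SE}_{\mathrm{CC}}} = 1 - 0.39 = 0.61,
\qquad
\frac{\mathrm{Var}_{\mathrm{MI}}}{\mathrm{Var}_{\mathrm{CC}}} = 0.61^{2} \approx 0.37,
\qquad
\frac{n_{\mathrm{equiv}}}{n_{\mathrm{CC}}} = \frac{1}{0.61^{2}} \approx 2.7 .
\]
```

Here MI denotes the analysis of the multiply imputed data and CC the complete case analysis. Under the stated assumption, the gain in precision is comparable to analysing about 2.7 times as many complete cases.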

Conclusions: By substantially reducing imputation model sizes, our adaptation makes multiple imputation feasible for large-scale survey data with multiple multi-item scales. For the data considered, analysis of the multiply imputed data shows greater power and efficiency than complete case analysis. The adaptation of multiple imputation makes better use of available data and can yield substantively different results from simpler techniques.

Files

13104_2016_Article_1853.pdf
