ungroup: Penalized Composite Link Model for Efficient Estimation of Smooth Distributions from Coarsely Binned Data
Marius D. Pascariu
Maciej J. Dańko
Versatile method for ungrouping histograms (binned count data), assuming that the counts are Poisson distributed and that the underlying sequence to be estimated on a fine grid is smooth. The method is based on the composite link model, and estimation is achieved by maximizing a penalized likelihood. Smooth, detailed sequences of counts and rates are thereby estimated from the binned counts. Ungrouping binned data can be desirable for several reasons: bins can be too coarse to allow for accurate analysis; comparisons can be hindered when different histograms use different grouping schemes; and the last interval is often wide and open-ended and thus hides a lot of information in the tail area. Age-at-death distributions grouped in age classes and abridged life tables are examples of binned data. Because its assumptions are modest, the approach is suitable for many demographic and epidemiological applications. For a detailed description of the method and its applications see Rizzi et al. (2015) <doi:10.1093/aje/kwv020>.
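To make the idea concrete, the following is a minimal sketch of a penalized composite link model fitted by iteratively reweighted least squares, in the spirit of Eilers (2007) and Rizzi et al. (2015). It is written in Python with NumPy for illustration only and is not the package's R implementation. The function name `pclm_sketch`, the parameters `lam` and `order`, the fixed smoothing parameter, and the toy counts are all assumptions introduced here. The package itself additionally handles rates through an offset, confidence intervals, and the choice of smoothing parameter, none of which are shown.

```python
import numpy as np


def pclm_sketch(y, C, lam=1.0, order=2, max_iter=100, tol=1e-8):
    """Illustrative penalized composite link model (hypothetical helper).

    Model: y_i ~ Poisson(mu_i), mu = C @ gamma, gamma = exp(beta),
    with a difference penalty lam * ||D beta||^2 on the fine grid.
    """
    y = np.asarray(y, dtype=float)
    C = np.asarray(C, dtype=float)
    n_fine = C.shape[1]

    # Difference matrix of the requested order and the resulting penalty.
    D = np.diff(np.eye(n_fine), n=order, axis=0)
    P = lam * D.T @ D

    # Start from a flat distribution that preserves the total count.
    beta = np.full(n_fine, np.log(y.sum() / n_fine))

    for _ in range(max_iter):
        gamma = np.exp(beta)          # latent counts on the fine grid
        mu = C @ gamma                # expected counts in the coarse bins
        Q = C * gamma                 # same as C @ diag(gamma)
        QtWinv = Q.T / mu             # Q' W^{-1}, with W = diag(mu)
        lhs = QtWinv @ Q + P
        rhs = QtWinv @ (y - mu + Q @ beta)
        beta_new = np.linalg.solve(lhs, rhs)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break

    return np.exp(beta)


# Toy data (made up for illustration): four bins, each covering five fine cells.
y = np.array([10.0, 30.0, 40.0, 20.0])
C = np.kron(np.eye(4), np.ones((1, 5)))   # composition matrix, shape (4, 20)
fine = pclm_sketch(y, C, lam=10.0)
```

The composition matrix `C` encodes which fine-grid cells belong to which bin, so a wide or open-ended last interval is handled simply by giving its row more columns. Larger values of `lam` yield smoother estimates.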
Currie ID, Durban M, Eilers PH (2004). "Smoothing and forecasting mortality rates." Statistical Modelling, 4(4), 279-298.
Eilers PH (2007). "Ill-posed problems with counts, the composite link model and penalized likelihood." Statistical Modelling, 7(3), 239-254. doi: 10.1177/1471082X0700700302.
Hastie TJ, Tibshirani RJ (1990). "Generalized additive models." Monographs on Statistics and Applied Probability, 43.
Human Mortality Database (2018). "University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Data downloaded on 17/01/2018." https://www.mortality.org.
Pascariu MD (2018). MortalityLaws: Parametric Mortality Models, Life Tables and HMD. R package version 1.6.0, https://github.com/mpascariu/MortalityLaws.
Rizzi S, Gampe J, Eilers PHC (2015). "Efficient Estimation of Smooth Distributions From Coarsely Grouped Data." American Journal of Epidemiology, 182(2), 138-147. doi: 10.1093/aje/kwv020.
Rizzi S, Halekoh U, Thinggaard M, Engholm G, Christensen N, Johannesen TB, Lindahl-Jacobsen R (Forthcoming). "How to estimate mortality trends from grouped vital statistics." International Journal of Epidemiology.
Rizzi S, Thinggaard M, Engholm G, Christensen N, Johannesen TB, Vaupel JW, Lindahl-Jacobsen R (2016). "Comparison of non-parametric methods for ungrouping coarsely aggregated data." BMC Medical Research Methodology, 16(1), 59. doi: 10.1186/s12874-016-0157-8.
Thompson R, Baker R (1981). "Composite link functions in generalized linear models." Applied Statistics, 125-131.