Published September 9, 2024 | Version 0.0.2
Software Open

Diffusion-based kernel density estimator (diffKDE)

  • 1. GEOMAR - Helmholtz Centre for Ocean Research Kiel, Kiel University, Germany
  • 2. Kiel University, Germany

Description

The diffKDE package implements a new algorithm for a diffusion-based kernel density estimator (diffKDE) (Chaudhuri and Marron, 2000; Botev et al., 2010) as a Python tool. It provides a function that calculates the diffKDE from one-dimensional data as the solution of the diffusion equation with Neumann boundary conditions and an initial value constructed from the delta distribution of the input data. The implementation is based on an equidistant finite-difference discretization in space and time, two pilot estimation steps, and a new approximation of the optimal bandwidth. For the diffKDE, the bandwidth parameter equals the positive square root of the final iteration time. The delta distribution in the initial value is approximated by a Dirac sequence (Hirsch and Lacombe, 1999). The pilot estimation steps are themselves simplified diffKDEs and are incorporated in the approximation of the optimal bandwidth. One pilot additionally serves as a parameter function in the diffusion equation, as suggested by Botev et al. (2010). The bandwidths for the pilot estimates are chosen with data-driven approaches following Silverman (1986). The optimal bandwidth for the diffKDE is a direct approximation of its analytical optimal solution, using the second pilot as an approximation of the true probability density. Furthermore, the package provides functions for visual output of the first pilot estimate, of the time evolution of the diffKDE solution, and for an interactive exploration of the smoothing grades of the diffKDE at different bandwidths. The interactive exploration can be used to identify suitable bandwidths for specific purposes. The diffKDE function requires a one-dimensional data set. Optional parameters are the lower and upper spatial boundaries, the numbers of spatial and temporal discretization intervals, and a fixed final iteration time.
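The core idea described above can be illustrated with a minimal NumPy sketch. It is not the package's code or API: the function names `silverman_bandwidth` and `diffusion_kde_sketch` and all parameter names are hypothetical. The sketch makes several simplifying assumptions: the parameter function is constant (so the plain heat equation is solved instead of the pilot-weighted one), the initial value is a normalized histogram rather than a Dirac sequence, time stepping is implicit Euler, and the final time is taken from Silverman's rule of thumb instead of the package's optimal bandwidth approximation.

```python
import numpy as np


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth of Silverman (1986) for a Gaussian kernel."""
    data = np.asarray(data, dtype=float)
    sigma = data.std(ddof=1)
    q25, q75 = np.percentile(data, [25, 75])
    iqr = q75 - q25
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return 0.9 * spread * data.size ** (-0.2)


def diffusion_kde_sketch(data, xmin=None, xmax=None,
                         n_space=200, n_time=100, T=None):
    """Solve u_t = 0.5 * u_xx with Neumann boundaries up to time T.

    Illustrative only: constant parameter function, histogram initial value.
    Returns the cell centres x and the density estimate u on those cells.
    """
    data = np.asarray(data, dtype=float)
    if T is None:
        # The bandwidth is the square root of the final time, so T = h**2.
        T = silverman_bandwidth(data) ** 2
    h = np.sqrt(T)
    # Assumed default: extend the domain three bandwidths beyond the data.
    xmin = data.min() - 3.0 * h if xmin is None else xmin
    xmax = data.max() + 3.0 * h if xmax is None else xmax

    edges = np.linspace(xmin, xmax, n_space + 1)   # equidistant grid
    dx = edges[1] - edges[0]
    x = 0.5 * (edges[:-1] + edges[1:])

    # Initial value: normalized histogram as a crude stand-in for the
    # delta distribution of the data (integrates to one).
    u, _ = np.histogram(data, bins=edges, density=True)

    # Second-difference matrix with zero-flux (Neumann) boundary rows.
    lap = (-2.0 * np.eye(n_space)
           + np.eye(n_space, k=1) + np.eye(n_space, k=-1))
    lap[0, 0] = lap[-1, -1] = -1.0

    # Implicit Euler: (I - dt/(2 dx^2) * lap) u_new = u_old.
    dt = T / n_time
    system = np.eye(n_space) - 0.5 * dt / dx**2 * lap
    for _ in range(n_time):
        u = np.linalg.solve(system, u)
    return x, u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(0.0, 1.0, 500)
    x, u = diffusion_kde_sketch(sample)
    print("integral of estimate:", u.sum() * (x[1] - x[0]))
```

Because the columns of the boundary-adjusted difference matrix sum to zero, each implicit step preserves the total mass, so the estimate keeps integrating to one. Passing a larger or smaller `T` shows the effect of the bandwidth on the smoothness of the estimate.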

Files

diffKDE.zip (50.9 kB)
md5:b7be31b24fa7a565c0820e0da3ae7420

Additional details

Software

Programming language
Python

References

  • P Chaudhuri and JS Marron. "Scale space view of curve estimation". In: The Annals of Statistics 28.2 (Apr. 2000), pp. 408-428. ISSN: 0090-5364. DOI: 10.1214/aos/1016218224.
  • ZI Botev, JF Grotowski, and DP Kroese. "Kernel density estimation via diffusion". In: The Annals of Statistics 38.5 (2010), pp. 2916-2957. DOI: 10.1214/10-AOS799.
  • B Silverman. "Density estimation". Monographs on Statistics and Applied Probability, 1986.
  • F Hirsch and G Lacombe. "Elements of Functional Analysis". Springer, 1999.