Published May 5, 2021 | Version v1.1.0
Software Open

lanl/pyDNMFk: version 1.1.0

Description

pyDNMFk is a software package for applying non-negative matrix factorization in a distributed fashion to large datasets. It can minimize the difference between the reconstructed data and the original data under different measures (Frobenius norm, KL divergence). Additionally, the Custom Clustering algorithm automatically determines the number of latent features. The software has the following capabilities:

  • Utilization of MPI4py for distributed operation.
  • Distributed NNSVD and SVD initializations.
  • Distributed Custom Clustering algorithm for automated estimation of the number of latent features (k).
  • Objective: minimization of the KL divergence or the Frobenius norm.
  • Optimization with multiplicative updates, BCD, and HALS.
  • Checkpoints for tracking runtime status enabling restart from saved state.
  • Distributed pruning of zero rows and zero columns of the data.
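
To make the objectives and the multiplicative-update option listed above more concrete, the following is a minimal single-process NumPy sketch. It is not pyDNMFk's API and does not reproduce its MPI-based distribution; the function names (`frobenius_error`, `kl_divergence`, `prune_zeros`, `nmf_mu`) are hypothetical and chosen only for illustration. It assumes the standard Lee-Seung multiplicative updates for the Frobenius objective and the generalized KL divergence as the second measure.

```python
import numpy as np


def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - WH."""
    return float(np.linalg.norm(X - W @ H, ord="fro"))


def kl_divergence(X, W, H, eps=1e-12):
    """Generalized KL divergence D(X || WH); eps guards log(0) and division by zero."""
    WH = W @ H + eps
    return float(np.sum(X * np.log((X + eps) / WH) - X + WH))


def prune_zeros(X):
    """Drop all-zero rows and columns; the masks allow mapping results back."""
    row_mask = X.any(axis=1)
    col_mask = X.any(axis=0)
    return X[row_mask][:, col_mask], row_mask, col_mask


def nmf_mu(X, k, n_iter=200, eps=1e-9, seed=0):
    """Frobenius-norm NMF with multiplicative updates (serial sketch)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))  # non-negative random initialization (an assumption;
    H = rng.random((k, n))  # the package itself also lists NNSVD/SVD initializations)
    for _ in range(n_iter):
        # Element-wise updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


# Small synthetic example: rank-2 non-negative data with one zero row and column.
rng = np.random.default_rng(42)
X = rng.random((6, 2)) @ rng.random((2, 5))
X[2, :] = 0.0
X[:, 3] = 0.0

X_pruned, rows, cols = prune_zeros(X)
W, H = nmf_mu(X_pruned, k=2)
print("pruned shape:", X_pruned.shape)
print("Frobenius error:", frobenius_error(X_pruned, W, H))
print("KL divergence:", kl_divergence(X_pruned, W, H))
```

In a distributed setting the data matrix would be partitioned across MPI ranks and the matrix products above would involve communication between ranks; that part is deliberately left out of this sketch.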

Files

lanl/pyDNMFk-v1.1.0.zip (11.2 MB)
md5:25aba590746c03ce10678f136b1570b4
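
The MD5 checksum listed for the archive can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` and the local file name in the usage note are assumptions for illustration, not part of the record.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming the archive was saved locally as `pyDNMFk-v1.1.0.zip`, calling `verify_download("pyDNMFk-v1.1.0.zip", "25aba590746c03ce10678f136b1570b4")` should return `True` for an intact download. MD5 is suitable for detecting accidental corruption, not for protecting against deliberate tampering.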

Additional details

Related works

References

  • Bhattarai, Manish, et al. "Distributed Non-Negative Tensor Train Decomposition." 2020 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2020.