Published June 21, 2021 | Version v1.0.0
Software Open

lanl/pyDNTNK: Release v1.0.0

Description

pyDNTNK is a software package for applying non-negative hierarchical tensor decompositions, such as Tensor Train and Hierarchical Tucker decompositions, to large datasets in a distributed fashion. It is built on top of pyDNMFk. Tensor Train (TT) and Hierarchical Tucker (HT) are state-of-the-art tensor networks introduced for the factorization of high-dimensional tensors. These methods transform the initial high-dimensional tensor into a network of low-dimensional tensors whose storage requirement grows only linearly with the number of dimensions. Many real-world data, such as density, temperature, population, and probability, are non-negative, and algorithms that preserve non-negativity are preferred because their results are easier to interpret. Here, we introduce distributed non-negative hierarchical tensor decomposition tools and demonstrate their scalability and compression on synthetic and real-world big datasets.
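To make the Tensor Train idea concrete, here is a minimal serial sketch in plain NumPy. It is not pyDNTNK's API or its distributed implementation: the function names `tt_svd` and `tt_to_full`, the relative truncation threshold `eps`, and the synthetic test tensor are all assumptions made for illustration. It uses the standard SVD-based sweep, so the resulting cores may contain negative entries; preserving non-negativity would require an NMF-based factorization at each step instead.

```python
import numpy as np


def tt_svd(tensor, eps=1e-10):
    """Serial TT-SVD sketch: split a d-way array into d three-way cores."""
    shape = tensor.shape
    cores = []
    rank = 1
    mat = np.asarray(tensor, dtype=float)
    for n in shape[:-1]:
        # Unfold: rows pair the previous rank with the current mode.
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # Keep singular values above a relative threshold (illustrative rule).
        new_rank = max(1, int(np.sum(s > eps * s[0])))
        cores.append(u[:, :new_rank].reshape(rank, n, new_rank))
        # Carry the remainder forward to the next mode.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract the cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full[0, ..., 0]


rng = np.random.default_rng(0)
shape = (6, 7, 8, 9)
# Synthetic non-negative tensor: a sum of two rank-1 terms,
# so every TT rank is at most 2.
tensor = sum(
    np.einsum("i,j,k,l->ijkl", *[rng.random(n) for n in shape])
    for _ in range(2)
)

cores = tt_svd(tensor)
ranks = [c.shape[2] for c in cores[:-1]]
stored = sum(c.size for c in cores)
rel_err = np.linalg.norm(tt_to_full(cores) - tensor) / np.linalg.norm(tensor)
print("TT ranks:", ranks)
print("stored entries:", stored, "of", tensor.size)
print("relative error:", rel_err)
```

Each core has shape (previous rank, mode size, next rank), so for bounded ranks the total number of stored entries grows with the sum of the mode sizes rather than their product.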

 Features:

  • Utilization of MPI4py for distributed operation.
  • Distributed reshaping and unfolding operations with Zarr and Dask.
  • Distributed hierarchical tensor decompositions such as Tensor Train and Hierarchical Tucker.
  • Ability to perform both standard SVD-based and NMF-based decompositions.
  • Scalability to tensors of very high dimensions.
  • Automated rank estimation with SVD for each stage of tensor decomposition.
  • Distributed pruning of zero rows and zero columns of the data.
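
Several of the features above (unfolding, pruning of zero rows and columns, SVD-based rank estimation, and an NMF-based factorization) can be illustrated on a single node with plain NumPy. The sketch below is an assumption-laden illustration, not pyDNTNK code: the helper names `unfold`, `prune_zeros`, `estimate_rank`, and `nmf` are hypothetical, the energy threshold is just one common rank heuristic and may differ from the package's criterion, and the NMF uses the classic multiplicative updates rather than any particular distributed solver.

```python
import numpy as np


def unfold(tensor, split):
    """Reshape so the first `split` modes index rows and the rest index columns."""
    rows = int(np.prod(tensor.shape[:split]))
    return tensor.reshape(rows, -1)


def prune_zeros(mat):
    """Drop all-zero rows and columns; return the masks so they can be restored."""
    row_mask = np.any(mat != 0, axis=1)
    col_mask = np.any(mat != 0, axis=0)
    return mat[row_mask][:, col_mask], row_mask, col_mask


def estimate_rank(mat, energy=0.999):
    """Smallest rank whose singular values hold `energy` of the squared norm.

    Assumes `mat` is not identically zero.
    """
    s = np.linalg.svd(mat, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, energy) + 1)


def nmf(mat, rank, iters=500, seed=0):
    """Multiplicative updates (Frobenius norm) for mat ~ w @ h with w, h >= 0."""
    rng = np.random.default_rng(seed)
    w = rng.random((mat.shape[0], rank))
    h = rng.random((rank, mat.shape[1]))
    tiny = 1e-12  # avoids division by zero
    for _ in range(iters):
        h *= (w.T @ mat) / (w.T @ w @ h + tiny)
        w *= (mat @ h.T) / (w @ h @ h.T + tiny)
    return w, h


rng = np.random.default_rng(1)
# Synthetic non-negative 3-way tensor whose first unfolding has rank 3.
tensor = (rng.random((20, 3)) @ rng.random((3, 30))).reshape(20, 5, 6)
tensor[4] = 0.0        # all-zero slice: a zero row in the unfolding
tensor[:, 2, 1] = 0.0  # all-zero fibre: a zero column in the unfolding

mat = unfold(tensor, split=1)
pruned, row_mask, col_mask = prune_zeros(mat)
rank = estimate_rank(pruned, energy=1 - 1e-10)
w, h = nmf(pruned, rank)
nmf_err = np.linalg.norm(pruned - w @ h) / np.linalg.norm(pruned)
print("unfolded:", mat.shape, "pruned:", pruned.shape, "rank:", rank)
print("NMF relative error:", nmf_err)
```

Pruning before factorization matters for the multiplicative updates in particular, since all-zero rows or columns contribute nothing to the fit and only add work; the returned masks let the factors be scattered back to the original index positions afterwards.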

Files

lanl/pyDNTNK-v1.0.0.zip

Size: 23.1 MB
md5:430d3067df0eb85478cce07109976e25
