Published November 24, 2024 | Version v3
Preprint Open

A Scalable Approach for Mapper via Efficient Spatial Search

Authors/Creators

Description

Topological Data Analysis (TDA) is a branch of applied mathematics that studies the shape of high-dimensional datasets using ideas from algebraic topology. The Mapper algorithm is a widely used TDA tool for uncovering hidden structure in complex data. However, existing implementations often construct the open covers on which Mapper is based with naive methods, which leads to poor performance on large, high-dimensional datasets. In this study, we introduce a more scalable method for constructing open covers for Mapper that leverages techniques from computational geometry. We present theoretical insights into the method and evaluate it experimentally on well-known datasets, where it shows substantial improvements in running time over existing approaches. The method is implemented in a new Python library, tda-mapper, freely available at https://github.com/lucasimi/tda-mapper-python for TDA practitioners and researchers.
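
To make the idea of building a cover through spatial search concrete, the following is a minimal sketch and not the method of the preprint or the API of tda-mapper. It greedily covers a point set with balls of a fixed radius, answering each "which points lie within this radius" query from a uniform grid index instead of scanning every point. The function names build_grid, ball_query and ball_cover are hypothetical, and the grid is a stand-in chosen for brevity: it visits 3^d neighbouring cells per query, so it is only practical in low dimension, whereas high-dimensional data usually calls for a metric tree or a similar structure.

```python
import itertools
import math
from collections import defaultdict


def build_grid(points, cell):
    """Hash each point index into a grid cell of side length `cell`."""
    grid = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(math.floor(c / cell) for c in p)
        grid[key].append(i)
    return grid


def ball_query(points, grid, cell, center, radius):
    """Return indices of points within `radius` of `center`.

    Assumes radius <= cell, so every match lies in a neighbouring cell.
    """
    base = tuple(math.floor(c / cell) for c in center)
    hits = []
    # Inspect only the 3^d cells adjacent to the cell containing `center`.
    for offset in itertools.product((-1, 0, 1), repeat=len(center)):
        key = tuple(b + o for b, o in zip(base, offset))
        for i in grid.get(key, ()):
            if math.dist(points[i], center) <= radius:
                hits.append(i)
    return hits


def ball_cover(points, radius):
    """Greedy cover: each still-uncovered point becomes a new ball center."""
    grid = build_grid(points, radius)
    covered = set()
    cover = []
    for i, p in enumerate(points):
        if i in covered:
            continue
        ball = ball_query(points, grid, radius, p, radius)
        cover.append(sorted(ball))
        covered.update(ball)
    return cover
```

Each ball may contain points that an earlier ball already covered, so neighbouring balls overlap; Mapper relies on such overlaps to connect the clusters found in adjacent cover elements. Replacing the grid lookup with a linear scan over all points gives the same cover but costs time proportional to the dataset size for every ball.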

Files

preprint.pdf

Files (7.7 MB)

Checksum Size
md5:4b8181bff83698d0be48b20f36e156a7 2.6 MB
md5:4caa11e013ea19a525295aae8c228765 5.1 MB

Additional details

Related works

Is supplement to
Software: 10.5281/zenodo.10642381 (DOI)