Published August 7, 2023 | Version v1
Presentation Open

Zarr: Community specification of large, cloud-optimised, N-dimensional, typed array storage

  • 1. Zarr
  • 2. German BioImaging
  • 3. NVIDIA

Description

A key feature of the Python data ecosystem is the reliance on simple but efficient primitives that follow well-defined interfaces to make tools work seamlessly together ( Cf. https://data-apis.org/ ). NumPy provides an in-memory representation for tensors. Dask provides parallelisation of tensor access. Xarray provides metadata linking tensor dimensions. Zarr provides a missing feature, namely the scalable, persistent storage for annotated hierarchies of tensors. Defined through a community process, the Zarr specification enables the storage of large out-of-memory datasets locally and in the cloud. Implementations exist in C++, C, Java, Javascript, Julia, and Python, enabling.

Files

slides.pdf

Files (20.5 MB)

Name Size Download all
md5:cf896f7df5865c194a9d17585b0c3791
20.5 MB Preview Download