Published August 7, 2023
| Version v1
Presentation
Open
Zarr: Community specification of large, cloud-optimised, N-dimensional, typed array storage
Description
A key feature of the Python data ecosystem is the reliance on simple but efficient primitives that follow well-defined interfaces to make tools work seamlessly together ( Cf. https://data-apis.org/ ). NumPy provides an in-memory representation for tensors. Dask provides parallelisation of tensor access. Xarray provides metadata linking tensor dimensions. Zarr provides a missing feature, namely the scalable, persistent storage for annotated hierarchies of tensors. Defined through a community process, the Zarr specification enables the storage of large out-of-memory datasets locally and in the cloud. Implementations exist in C++, C, Java, Javascript, Julia, and Python, enabling.
Files
slides.pdf
Files
(20.5 MB)
Name | Size | Download all |
---|---|---|
md5:cf896f7df5865c194a9d17585b0c3791
|
20.5 MB | Preview Download |