Presentation Open Access
{ "description": "<p>The \"groupby\" or \"split-apply-combine\" paradigm is ubiquitous in scientific analysis, though it goes by many names, e.g. \"binning\", \"histogramming\", \"resampling\", \"compositing\", or \"climatology reductions\". Xarray implements the groupby paradigm through a \"GroupBy\" object. Historically, the underlying algorithm has not been dask-aware, and it tends to fail disastrously with large Pangeo-scale distributed workflows. Here I present \"flox\": a new package that explores effective strategies for groupby reductions at scale with dask. Ongoing work will plug this package into xarray in a backwards-compatible manner, allowing the community to seamlessly benefit from significantly more efficient groupby computations. See https://flox.readthedocs.io for more.</p>", "license": "https://creativecommons.org/licenses/by/4.0/legalcode", "creator": [ { "affiliation": "NCAR", "@id": "https://orcid.org/0000-0002-6861-8734", "@type": "Person", "name": "Cherian, Deepak" } ], "url": "https://zenodo.org/record/5772165", "datePublished": "2021-11-17", "keywords": [ "Pangeo", "Xarray" ], "@context": "https://schema.org/", "identifier": "https://doi.org/10.5281/zenodo.5772165", "@id": "https://doi.org/10.5281/zenodo.5772165", "@type": "PresentationDigitalDocument", "name": "flox: Fast & furious GroupBy reductions with Dask at Pangeo-scale" }
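As a rough illustration of the split-apply-combine idea described in the abstract, the sketch below computes a grouped mean over data stored in separate chunks. Each chunk is first reduced to per-group partial sums and counts, and the partials are then merged. This is one common strategy for chunked data and is not claimed to be how flox or xarray implement it; the function names `chunk_partials` and `combine` and the sample month labels and values are hypothetical, and plain Python stands in for dask.

```python
from collections import defaultdict


def chunk_partials(labels, values):
    """Split and apply on one chunk: accumulate a [sum, count] pair per group."""
    partial = defaultdict(lambda: [0.0, 0])
    for label, value in zip(labels, values):
        partial[label][0] += value
        partial[label][1] += 1
    return partial


def combine(partials):
    """Combine: merge the per-chunk partials, then finalize each group's mean."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for label, (group_sum, group_count) in partial.items():
            total[label][0] += group_sum
            total[label][1] += group_count
    return {label: s / n for label, (s, n) in total.items()}


# Hypothetical data: month labels and a measured value, stored in two chunks.
chunks = [
    (["jan", "feb", "jan"], [1.0, 4.0, 3.0]),
    (["feb", "mar", "jan"], [6.0, 7.0, 5.0]),
]

# Each chunk is reduced independently, so only small per-group partials
# (never the full data) need to be brought together in the final step.
climatology = combine(chunk_partials(labels, values) for labels, values in chunks)
print(climatology)
```

Because the per-chunk step touches only its own chunk, it could in principle run in parallel on separate workers, which is why this pattern suits distributed arrays.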
Metric | All versions | This version |
---|---|---|
Views | 77 | 77 |
Downloads | 25 | 25 |
Data volume | 293.4 MB | 293.4 MB |
Unique views | 70 | 70 |
Unique downloads | 23 | 23 |