Presentation Open Access

flox: Fast & furious GroupBy reductions with Dask at Pangeo-scale

Cherian, Deepak

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Cherian, Deepak</dc:creator>
  <dc:description>The "groupby" or "split-apply-combine" paradigm is ubiquitous in scientific analysis, though it goes by different names, e.g. "binning", "histogramming", "resampling", "compositing", or "climatology reductions". Xarray implements the groupby paradigm through a "GroupBy" object. Historically the underlying algorithm has not been dask-aware, and it tends to fail disastrously with large Pangeo-scale distributed workflows. Here I present "flox": a new package that explores effective strategies for groupby reductions at scale with dask. Ongoing work will plug this package into xarray in a backwards-compatible manner, allowing the community to seamlessly benefit from significantly more efficient groupby computations. See for more.</dc:description>
  <dc:title>flox: Fast &amp; furious GroupBy reductions with Dask at Pangeo-scale</dc:title>
</oai_dc:dc>
                  All versions   This version
Views             77             77
Downloads         25             25
Data volume       293.4 MB       293.4 MB
Unique views      70             70
Unique downloads  23             23

