{
  "DOI": "10.5281/zenodo.5772165",
  "abstract": "The \"groupby\" or \"split-apply-combine\" paradigm is ubiquitous in scientific analysis, though it goes by different names, e.g. \"binning\", \"histogramming\", \"resampling\", \"compositing\", or \"climatology reductions\". Xarray implements the groupby paradigm through a \"GroupBy\" object. Historically, the underlying algorithm has not been dask-aware, and it tends to fail disastrously with large Pangeo-scale distributed workflows. Here I present \"flox\": a new package that explores effective strategies for groupby reductions at scale with dask. Ongoing work will plug this package into xarray in a backwards-compatible manner, allowing the community to seamlessly benefit from significantly more efficient groupby computations. See https://flox.readthedocs.io for more.",
  "author": [
    {
      "family": "Cherian",
      "given": "Deepak"
    }
  ],
  "id": "5772165",
  "issued": {
    "date-parts": [
      [
        "2021",
        "11",
        "17"
      ]
    ]
  },
  "publisher": "Zenodo",
  "title": "flox: Fast & furious GroupBy reductions with Dask at Pangeo-scale",
  "type": "speech"
}