Published September 2, 2016 | Version v1
Software Open

histogrammar-python: 1.0.0

  • 1. Princeton University
  • 2. University of Cape Town
  • 3. Open Data Group

Description

Histogrammar is a suite of data aggregation primitives designed for use in parallel processing. In the simplest case, these primitives compute histograms, but their generality allows much more.
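As a rough illustration of what "designed for use in parallel processing" means, the sketch below fills one aggregator per data partition and then merges the partial results. It uses only the Python standard library. The `BinnedCount` class, its `fill` method, and the sample data are hypothetical names invented for this example; they are not Histogrammar's actual API, which is documented at http://histogrammar.org.

```python
import math


class BinnedCount:
    """Hypothetical fixed-width histogram, for illustration only."""

    def __init__(self, num, low, high):
        self.num, self.low, self.high = num, low, high
        self.counts = [0] * num
        self.underflow = 0   # values below low, including -infinity
        self.overflow = 0    # values at or above high, including +infinity
        self.nanflow = 0     # NaN values get their own counter

    def fill(self, x):
        """Add one value to the appropriate counter."""
        if math.isnan(x):
            self.nanflow += 1
        elif x < self.low:
            self.underflow += 1
        elif x >= self.high:
            self.overflow += 1
        else:
            index = int((x - self.low) / (self.high - self.low) * self.num)
            self.counts[min(index, self.num - 1)] += 1

    def __add__(self, other):
        """Merge two partial results that share the same binning."""
        if (self.num, self.low, self.high) != (other.num, other.low, other.high):
            raise ValueError("cannot merge histograms with different binning")
        out = BinnedCount(self.num, self.low, self.high)
        out.counts = [a + b for a, b in zip(self.counts, other.counts)]
        out.underflow = self.underflow + other.underflow
        out.overflow = self.overflow + other.overflow
        out.nanflow = self.nanflow + other.nanflow
        return out


def fill_all(data):
    """Fill a fresh aggregator from an iterable of numbers."""
    hist = BinnedCount(5, 0.0, 10.0)
    for x in data:
        hist.fill(x)
    return hist


# Each partition could be processed by a different worker.
partitions = [[1.0, 2.5, 7.0], [float("nan"), 9.9, -3.0], [12.0, 2.6]]
partials = [fill_all(part) for part in partitions]

# Merge the partial results pairwise.
merged = partials[0] + partials[1] + partials[2]

# A single pass over all the data, for comparison.
single = fill_all([x for part in partitions for x in part])
```

Because merging only adds counters, `merged` and `single` hold the same contents regardless of how the data were split, which is the property that makes this style of aggregation safe to distribute.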

See http://histogrammar.org for a complete introduction.

This Python implementation of Histogrammar adheres to version 1.0 of the specification and has been tested to guarantee compatibility with the Scala implementation. The test suite includes empty datasets, NaN/infinity handling, associativity tests, and numerical agreement at the level of one part in a trillion (double precision). Several common histogram types can be plotted in Matplotlib, PyROOT, and Bokeh with a single method call.
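The kinds of checks listed above can be sketched in a few lines. The following is not taken from the project's test suite: `SumAgg`, `filled`, and `agree` are hypothetical helpers written for this example, and the `1e-12` relative tolerance is an assumption chosen to mirror the "one part in a trillion" figure.

```python
import math


class SumAgg:
    """Hypothetical aggregator tracking an entry count and a running sum."""

    def __init__(self, entries=0, total=0.0):
        self.entries = entries
        self.total = total

    def fill(self, x):
        self.entries += 1
        self.total += x

    def __add__(self, other):
        return SumAgg(self.entries + other.entries, self.total + other.total)


def filled(data):
    """Return a SumAgg filled from an iterable of numbers."""
    agg = SumAgg()
    for x in data:
        agg.fill(x)
    return agg


def agree(a, b, rel_tol=1e-12):
    """Compare two aggregators: exact counts, sums within rel_tol, NaN equal to NaN."""
    if a.entries != b.entries:
        return False
    if math.isnan(a.total) or math.isnan(b.total):
        return math.isnan(a.total) and math.isnan(b.total)
    return math.isclose(a.total, b.total, rel_tol=rel_tol)


a = filled([0.1, 0.2, 0.3])
b = filled([1e6, 2.5])
c = filled([7.25, -3.0])
empty = filled([])
with_nan = filled([float("nan")])

# Associativity: the grouping of merges must not matter beyond rounding.
associative = agree((a + b) + c, a + (b + c))

# Empty dataset: merging an empty aggregator must leave the result unchanged.
empty_is_identity = agree(empty + a, a) and agree(a + empty, a)

# NaN handling: a NaN input propagates the same way on either side of a merge.
nan_consistent = agree(with_nan + a, a + with_nan)
```

Floating-point addition is not exactly associative, which is why the comparison uses a relative tolerance rather than strict equality for the sums while still requiring the entry counts to match exactly.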

If Numpy or Pandas is available, histograms and other aggregators can be filled from arrays ten to a hundred times more quickly via Numpy commands, rather than Python for loops.

If PyROOT is available, histograms and other aggregators can be filled from ROOT TTrees hundreds of times more quickly by JIT-compiling a specialized C++ filler.

Histograms and other aggregators may also be converted into CUDA code for inclusion in a GPU workflow. If PyCUDA is available, they can also be filled from Numpy arrays by JIT-compiling that CUDA code.

Files (39.6 MB)

histogrammar-python-1.0.0.zip
md5:6a5d435cc85a4cd227ff1b169254d1ca
39.6 MB

Additional details