Software Open Access

Hickle: a HDF5-based python pickle replacement

Danny Price; Sébastien Celles; Pieter T. Eendebak; Michael M. McKerns; Eben M. Olson; Colin Raffel; Bairen Yi

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.2345649</identifier>
  <creators>
    <creator>
      <creatorName>Danny Price</creatorName>
      <affiliation>Swinburne University of Technology</affiliation>
    </creator>
    <creator>
      <creatorName>Sébastien Celles</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="">0000-0001-9987-4338</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Pieter T. Eendebak</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="">0000-0001-7018-1124</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Michael M. McKerns</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="">0000-0001-8342-3778</nameIdentifier>
    </creator>
    <creator>
      <creatorName>Eben M. Olson</creatorName>
    </creator>
    <creator>
      <creatorName>Colin Raffel</creatorName>
    </creator>
    <creator>
      <creatorName>Bairen Yi</creatorName>
    </creator>
  </creators>
    <title>Hickle: a HDF5-based python pickle replacement</title>
    <subject>data format</subject>
    <date dateType="Issued">2018-12-17</date>
  <resourceType resourceTypeGeneral="Software"/>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.2345648</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
    <rights rightsURI="">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
    <description descriptionType="Abstract">&lt;p&gt;&lt;code&gt;hickle&lt;/code&gt; is a Python 2/3 package for quickly dumping and loading Python data structures to Hierarchical Data Format 5 (HDF5) files. When dumping to HDF5, &lt;code&gt;hickle&lt;/code&gt; automatically converts Python data structures (e.g. lists, dictionaries, &lt;code&gt;numpy&lt;/code&gt; arrays) into HDF5 groups and datasets. When loading from file, &lt;code&gt;hickle&lt;/code&gt; automatically converts data back into its original data type. A key motivation for &lt;code&gt;hickle&lt;/code&gt; is to provide high-performance loading and storage of scientific data in the widely-supported HDF5 format.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;hickle&lt;/code&gt; is designed as a drop-in replacement for the Python &lt;code&gt;pickle&lt;/code&gt; package, which converts Python object hierarchies to and from Python-specific byte streams (processes known as &amp;#39;pickling&amp;#39; and &amp;#39;unpickling&amp;#39; respectively). Several different protocols exist, and files are not designed to be compatible between Python versions, nor interpretable in other languages. In contrast, &lt;code&gt;hickle&lt;/code&gt; stores and loads files from HDF5, for which application programming interfaces (APIs) exist in most major languages, including C, Java, R, and MATLAB.&lt;/p&gt;

&lt;p&gt;Python data structures are mapped into the HDF5 abstract data model in a logical fashion, using the &lt;code&gt;h5py&lt;/code&gt; package. Metadata required to reconstruct the hierarchy of objects, and to allow conversion into Python objects, is stored in HDF5 attributes. Most commonly used Python iterables (dict, tuple, list, set), and data types (int, float, str) are supported, as are &lt;code&gt;numpy&lt;/code&gt; N-dimensional arrays. Commonly-used &lt;code&gt;astropy&lt;/code&gt; data structures and &lt;code&gt;scipy&lt;/code&gt; sparse matrices are also supported.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;hickle&lt;/code&gt; has been used in many scientific research projects, including:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Visualization and machine learning on volumetric fluorescence microscopy datasets from histological tissue imaging.&lt;/li&gt;
	&lt;li&gt;Caching pre-computed features for MIDI and audio files for downstream machine learning tasks.&lt;/li&gt;
	&lt;li&gt;Storage and transmission of high volumes of shotgun proteomics data, such as mass spectra of proteins and peptide segments.&lt;/li&gt;
	&lt;li&gt;Storage of astronomical data and calibration data from radio telescopes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;hickle&lt;/code&gt; is released under the MIT license, and is available from PyPI via &lt;code&gt;pip&lt;/code&gt;; source code is available at &lt;a href=""&gt;&lt;/a&gt;. Note: this text is modified from the hickle Journal of Open Source Software paper.&lt;/p&gt;</description>

