Published August 29, 2024 | Version v1.1
Software Open

Artifact for "HiPy: Extracting High-Level Semantics From Python Code For Data Processing"

  • 1. Technical University of Munich

Description

This is the artifact for the paper "HiPy: Extracting High-Level Semantics From Python Code For Data Processing". It provides the source code of HiPy as well as the tooling, benchmarks, and Docker images needed to reproduce the experiments shown in the paper.

v1.1: Increased the memory limit from 30 GiB to 40 GiB to provide a safety margin.

Hardware Dependencies

  • x86-64 machine with recent Linux (tested: Ubuntu 24.04 with Linux kernel 6.8.0), Docker, and make installed.

  • Experiments need ~25 GiB disk space and ~40 GiB main memory.
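The requirements above can be checked before starting a long benchmark run. The following is a minimal sketch, not part of the artifact: the function names `check_requirements` and `probe_host` are hypothetical, and the thresholds simply mirror the approximate figures listed above (~25 GiB disk, ~40 GiB memory).

```python
import os
import shutil

GIB = 1024 ** 3  # bytes per GiB


def check_requirements(free_disk_bytes, total_mem_bytes, missing_tools,
                       need_disk_gib=25, need_mem_gib=40):
    """Return a list of problems; an empty list means every check passed.

    The default thresholds are the approximate values from the text above,
    so treat a near miss as a warning rather than a hard failure.
    """
    problems = []
    if free_disk_bytes < need_disk_gib * GIB:
        problems.append(f"less than {need_disk_gib} GiB of free disk space")
    if total_mem_bytes < need_mem_gib * GIB:
        problems.append(f"less than {need_mem_gib} GiB of main memory")
    for tool in missing_tools:
        problems.append(f"required tool not found on PATH: {tool}")
    return problems


def probe_host(path="."):
    """Collect the inputs for check_requirements on a Linux host."""
    free_disk = shutil.disk_usage(path).free
    # Physical memory = number of pages * page size (available on Linux).
    total_mem = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    missing = [t for t in ("docker", "make") if shutil.which(t) is None]
    return free_disk, total_mem, missing
```

Calling `check_requirements(*probe_host("."))` from the directory that will hold the artifact returns the list of unmet requirements. Note that a machine with nominally 40 GiB of RAM may report slightly less, so the memory check is deliberately only a hint.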

Getting Started

  1. Change into the paper directory (cd paper)

  2. Extract images from the archive: make load-images

  3. Run all benchmarks once (~20–30 minutes): make run-benchmarks WARMUP_RUNS=0 RUNS=1

  4. Plot the results: make plot

  5. You will find both the raw results (results.csv) and the four figures in the results directory.
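Once the steps above have finished, the raw results can be inspected programmatically. The sketch below only reports the column names and the number of data rows. It assumes that results.csv starts with a header row, which the description above does not confirm, and both function names are hypothetical.

```python
import csv


def summarize_csv_lines(lines):
    """Return (column_names, row_count) for CSV text given as an iterable of lines.

    Assumption: the first line is a header row.
    """
    reader = csv.reader(lines)
    header = next(reader, [])          # empty input yields an empty header
    row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return header, row_count


def summarize_csv(path):
    """Same as summarize_csv_lines, but reads from a file path."""
    with open(path, newline="") as f:
        return summarize_csv_lines(f)
```

For example, `summarize_csv("results/results.csv")`, run from the paper directory, would return the header and row count, assuming the file is located there.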

Step By Step Instructions

  1. Change into the paper directory (cd paper)

  2. Extract images from the archive:[^1] make load-images

  3. Run all benchmarks 3 times with one warmup run (~1.5–2h): make run-benchmarks WARMUP_RUNS=1 RUNS=3

  4. Plot the results: make plot

  5. You will find both the raw results (results.csv) and the four figures in the results directory.
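The steps above can also be scripted, for instance to run the full evaluation unattended. This is a minimal sketch: the make targets and variables are taken from the instructions above, while the helper names `benchmark_commands` and `run_all` and the default location of the paper directory are assumptions.

```python
import subprocess


def benchmark_commands(warmup_runs=1, runs=3):
    """Build the sequence of make invocations from the instructions above."""
    return [
        ["make", "load-images"],
        ["make", "run-benchmarks", f"WARMUP_RUNS={warmup_runs}", f"RUNS={runs}"],
        ["make", "plot"],
    ]


def run_all(paper_dir="paper", warmup_runs=1, runs=3):
    """Run each step inside the paper directory, stopping at the first failure."""
    for cmd in benchmark_commands(warmup_runs, runs):
        # check=True raises CalledProcessError if make exits with a non-zero status.
        subprocess.run(cmd, cwd=paper_dir, check=True)
```

Calling `run_all(warmup_runs=0, runs=1)` would correspond to the shorter run from the Getting Started section.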

Reusability Guide

The benchmarking setup can be extended for future research on improving the runtime of Python workloads, with both competitor implementations and new benchmarks. To add a new benchmark, create a new folder in the benchmarks directory (an existing benchmark can serve as a template) and register the benchmark in the run.py file. The Docker images can be rebuilt with the provided Makefile using make build-images.

In addition, this artifact can be used to experiment with the implementation of HiPy (though we plan to open-source and further develop HiPy). Python programs can be run with HiPy in the Docker container through the Makefile target make run FILE=.... Two demo files that print the generated IR are also included:

  1. make run FILE=demo.py executes a simple example that computes the prime numbers below 100.

  2. make run FILE=demo-pandas.py executes a more complex example that also uses pandas and numpy.
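To give an impression of the kind of program the first demo describes, here is a hypothetical sketch of a script that computes the prime numbers below 100. It is not the demo.py shipped with the artifact, and it makes no claim about which Python constructs HiPy supports. It only illustrates the sort of plain-Python workload mentioned above.

```python
def primes_below(limit):
    """Sieve of Eratosthenes: return all primes strictly below `limit`."""
    if limit < 3:
        return []  # there are no primes below 2
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n*n.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_below(100))
```

The sieve finds 25 primes below 100, the largest being 97.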

Help/Options

Please consult the help target of the Makefile located at paper/Makefile for more options.

[^1]: Alternatively, images can be built from source: make build-images

Files

Total size: 4.0 GB
md5:d48203c894c01e042db9cb5d147bc1c8