Published April 17, 2023 | Version v1
Conference paper Open

TransLib: A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes

  • 1. University of Bologna, Bologna, Italy
  • 2. University of Bologna, Bologna, Italy; ETH, Zurich, Switzerland

Description

Reduced-precision floating-point (FP) arithmetic is being widely adopted to reduce memory footprint and execution time on battery-powered Internet of Things (IoT) end-nodes. However, reduced-precision computations must meet end-to-end precision constraints to be acceptable at the application level. This work introduces TransLib (https://github.com/ahmad-mirsalari/TransLib), an open-source kernel library based on transprecision computing principles, which provides knobs to exploit different FP data types (i.e., float, float16, and bfloat16), also considering the trade-off between homogeneous and mixed-precision solutions. We demonstrate the capabilities of the proposed library on PULP, a 32-bit microcontroller (MCU) coupled with a parallel, programmable accelerator. On average, TransLib kernels achieve an IPC of 0.94 and a speed-up of 1.64× using 16-bit vectorization. The parallel variants achieve speed-ups of 1.97×, 3.91×, and 7.59× on 2, 4, and 8 cores, respectively. The memory footprint reduction is between 25% and 50%. Finally, we show that mixed-precision variants increase the accuracy by 30× at the cost of 2.09× execution time and 1.35× memory footprint compared to the vectorized float16 variants.
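The trade-off between homogeneous and mixed-precision solutions can be illustrated with a minimal sketch. This is not TransLib code: the function names (`to_f16`, `to_f32`, `dot_f16`, `dot_mixed`, `reference_dot`) are hypothetical, and float16 is emulated in software using Python's standard `struct` module rather than executed on PULP hardware. The sketch compares a dot product that keeps every intermediate value in float16 with one that stores the inputs in float16 but accumulates in float32.

```python
import math
import struct


def to_f16(x: float) -> float:
    """Round a value to the nearest IEEE 754 binary16 (float16) number."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x: float) -> float:
    """Round a value to the nearest IEEE 754 binary32 (float) number."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_f16(a, b):
    """Homogeneous variant (assumed): inputs, products, and accumulator in float16."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = to_f16(to_f16(x) * to_f16(y))
        acc = to_f16(acc + prod)
    return acc


def dot_mixed(a, b):
    """Mixed-precision variant (assumed): float16 inputs, float32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two float16 values is exactly representable in float32.
        acc = to_f32(acc + to_f16(x) * to_f16(y))
    return acc


def reference_dot(a, b):
    """Reference result: float16 inputs, accurately summed in double precision."""
    return math.fsum(to_f16(x) * to_f16(y) for x, y in zip(a, b))


if __name__ == "__main__":
    a = [0.1] * 1000
    b = [1.0] * 1000
    ref = reference_dot(a, b)
    print("float16 error:", abs(dot_f16(a, b) - ref))
    print("mixed error:  ", abs(dot_mixed(a, b) - ref))
```

In this illustration the float16 accumulator loses accuracy as the running sum grows, because the spacing between representable float16 values becomes comparable to each addend. The wider accumulator avoids that at the cost of extra storage and, on real hardware, fewer elements processed per vector instruction.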

Files

TransLib A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes.pdf

Additional details

Funding

APROPOS – Approximate Computing for Power and Energy Optimisation (grant 956090)
European Commission