Published January 2, 2025 | Version v1
Conference paper Open

Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms

Description

Running deep neural network (DNN) applications on edge platforms requires low-latency inference. However, scheduling multiple DNN workloads with varying compute and latency needs on resource-constrained edge devices is challenging. This work introduces Tango, a framework that optimizes multi-DNN inference on heterogeneous edge platforms. Using a reinforcement learning agent, Tango balances cluster selection, accuracy configuration, and frequency scaling to minimize latency while maintaining acceptable accuracy. Implemented as portable middleware on Jetson TX, Tango achieves 61% lower latency and 48.4% lower energy consumption, with a maximum accuracy loss of 1.59%, outperforming existing scheduling strategies.

Files

tango_iccd24.pdf (1.1 MB)
md5:7e0902b9057f88a2d9c60f14f540c1df

Additional details

Funding

European Commission
APROPOS - Approximate Computing for Power and Energy Optimisation (grant no. 956090)

Dates

Available
2025-02-01