Tango: Low Latency Multi-DNN Inference on Heterogeneous Edge Platforms
Description
Running deep neural network (DNN) applications on edge platforms requires low-latency inference. However, scheduling multiple DNN workloads with varying compute and latency needs on resource-constrained edge devices is challenging. This work introduces Tango, a framework that optimizes multi-DNN inference on heterogeneous edge platforms. Using a reinforcement learning agent, Tango balances cluster selection, accuracy configuration, and frequency scaling to minimize latency while maintaining acceptable accuracy. Implemented as portable middleware on Jetson TX, Tango achieves 61% lower latency and 48.4% lower energy consumption, with a maximum accuracy loss of 1.59%, outperforming existing scheduling strategies.
Files
tango_iccd24.pdf (1.1 MB)
md5:7e0902b9057f88a2d9c60f14f540c1df
Additional details
Dates
- Available: 2025-02-01