Published January 23, 2024 | Version v1
Conference paper | Open Access

Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems

  • 1. Indian Institute of Technology Kharagpur
  • 2. UC Irvine
  • 3. CNR-IEIIT
  • 4. Consorzio Nazionale Interuniversitario per le Telecomunicazioni
  • 5. Politecnico di Torino

Description

The increasing pervasiveness of intelligent mobile applications requires exploiting the full range of resources offered by the mobile-edge-cloud network for the execution of inference tasks. However, due to the heterogeneity of such multi-tiered networks, it is essential to match the applications' demand to the available resources while minimizing energy consumption. Modern dynamic deep neural networks (DNNs) achieve this goal through multi-branched architectures in which early exits enable sample-based adaptation of the model depth. In this paper, we tackle the problem of allocating sections of DNNs with early exits to the nodes of the mobile-edge-cloud system. Using a 3-stage graph-modeling approach, we represent the possible options for splitting the DNN and deploying its blocks on the multi-tiered network, embedding both the system constraints and the application requirements in a convenient and efficient way. Our framework, named Feasible Inference Graph (FIN), identifies the solution that minimizes the overall inference energy consumption while enabling distributed inference over the multi-tiered network with the target quality and latency. Our results, obtained for DNNs of different complexity, show that FIN matches the optimum and yields over 65% energy savings relative to a state-of-the-art cost-minimization technique.
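To give an intuition for the kind of problem the abstract describes, the sketch below encodes a toy version of it: DNN blocks are placed on mobile, edge, or cloud tiers, each placement and each inter-tier transfer has an (energy, latency) cost, and a cheapest-energy deployment is searched for subject to a latency budget. All numbers, names, and the single-exit, linear-split model are illustrative assumptions, not taken from the paper; FIN's actual 3-stage graph construction is richer than this.

```python
import heapq

# Hypothetical toy instance: 3 DNN blocks deployed across 3 tiers.
# Cost figures are made up for illustration only.
BLOCKS = 3
TIERS = ["mobile", "edge", "cloud"]
# Per-block execution cost on each tier: (energy, latency)
EXEC = {"mobile": (5.0, 3.0), "edge": (2.0, 1.5), "cloud": (1.0, 1.0)}
# Cost of shipping activations one tier "up": (energy, latency)
LINK = {("mobile", "edge"): (1.0, 2.0), ("edge", "cloud"): (1.5, 4.0)}

def min_energy_path(latency_budget):
    """Uniform-cost search over states (energy, block, tier, latency):
    place blocks in order, allowing splits only upward
    (mobile -> edge -> cloud), and never push a state that would
    exceed the latency budget. The first goal state popped is the
    minimum-energy feasible deployment."""
    heap = [(0.0, 0, 0, 0.0)]  # start: no blocks placed, on mobile
    while heap:
        energy, block, t, lat = heapq.heappop(heap)
        if block == BLOCKS:
            return energy, lat  # all blocks placed within budget
        # Option 1: run the next block on the current tier.
        e, l = EXEC[TIERS[t]]
        if lat + l <= latency_budget:
            heapq.heappush(heap, (energy + e, block + 1, t, lat + l))
        # Option 2: split here, moving activations one tier up.
        if t + 1 < len(TIERS):
            le, ll = LINK[(TIERS[t], TIERS[t + 1])]
            if lat + ll <= latency_budget:
                heapq.heappush(heap, (energy + le, block, t + 1, lat + ll))
    return None  # no feasible deployment under this budget
```

With a budget of 8.0, running all three blocks at the edge (energy 7.0, latency 6.5) beats the cheaper-to-execute cloud placement, whose transfer latency blows the budget; with a budget of 3.0 no placement is feasible. The search space is a DAG (block index and tier are both monotone), so the search terminates without a visited set.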

Files

Split_Inference_Infocom24.pdf (4.4 MB)
md5:0de82de5a36ee55f94db4e65bb9b3bab

Additional details

Funding

PREDICT-6G – PRogrammable AI-Enabled DeterminIstiC neTworking for 6G (Grant No. 101095890), European Commission