Resource-aware Deployment of Dynamic DNNs over Multi-tiered Interconnected Systems

doi:10.5281/zenodo.10554945

Published January 23, 2024 | Version v1

Conference paper | Open access

1. Indian Institute of Technology Kharagpur
2. UC Irvine
3. CNR-IEIIT
4. Consorzio Nazionale Interuniversitario per le Telecomunicazioni
5. Politecnico di Torino

The increasing pervasiveness of intelligent mobile applications requires exploiting the full range of resources offered by the mobile-edge-cloud network for the execution of inference tasks. However, due to the heterogeneity of such multi-tiered networks, it is essential to make the applications' demand amenable to the available resources while minimizing energy consumption. Modern dynamic deep neural networks (DNNs) achieve this goal through multi-branched architectures in which early exits enable sample-based adaptation of the model depth. In this paper, we tackle the problem of allocating sections of DNNs with early exits to the nodes of the mobile-edge-cloud system. By envisioning a 3-stage graph-modeling approach, we represent the possible options for splitting the DNN and deploying the DNN blocks on the multi-tiered network, embedding both the system constraints and the application requirements in a convenient and efficient way. Our framework, named Feasible Inference Graph (FIN), can identify the solution that minimizes the overall inference energy consumption while enabling distributed inference over the multi-tiered network with the target quality and latency. Our results, obtained for DNNs with different levels of complexity, show that FIN matches the optimum and yields over 65% energy savings relative to a state-of-the-art technique for cost minimization.
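To make the placement problem in the abstract concrete, the following is a minimal sketch, not the FIN algorithm itself: it enumerates every way to place three DNN blocks on a mobile, edge, or cloud tier and keeps the lowest-energy placement that meets a latency bound. Everything in it is an assumption made for illustration: the cost tables `COMPUTE` and `TRANSFER`, the function names, the rule that blocks only move "upward" through the tiers, and the omission of early exits and quality targets.

```python
from itertools import product

TIERS = ("mobile", "edge", "cloud")

# Hypothetical (energy in J, latency in ms) of running each DNN block on each tier.
# All numbers are invented for illustration; none come from the paper.
COMPUTE = [
    {"mobile": (0.2, 4.0), "edge": (0.5, 2.0), "cloud": (0.9, 1.0)},
    {"mobile": (1.5, 30.0), "edge": (0.6, 6.0), "cloud": (1.0, 2.0)},
    {"mobile": (3.0, 60.0), "edge": (1.2, 10.0), "cloud": (1.1, 3.0)},
]

# Hypothetical (energy, latency) of shipping intermediate data between tiers.
TRANSFER = {
    ("mobile", "edge"): (0.3, 5.0),
    ("edge", "cloud"): (0.2, 8.0),
    ("mobile", "cloud"): (0.6, 15.0),
}


def transfer_cost(src, dst):
    """Cost of moving data from src to dst; zero when the tier does not change."""
    return (0.0, 0.0) if src == dst else TRANSFER[(src, dst)]


def placement_cost(placement):
    """Total (energy, latency) of a placement; the input is assumed to start on mobile."""
    energy = latency = 0.0
    current = "mobile"
    for block_costs, tier in zip(COMPUTE, placement):
        t_energy, t_latency = transfer_cost(current, tier)
        c_energy, c_latency = block_costs[tier]
        energy += t_energy + c_energy
        latency += t_latency + c_latency
        current = tier
    return energy, latency


def best_placement(max_latency_ms):
    """Return (placement, energy, latency) with minimum energy, or None if infeasible."""
    best = None
    for placement in product(TIERS, repeat=len(COMPUTE)):
        order = [TIERS.index(tier) for tier in placement]
        if order != sorted(order):
            continue  # assumption: blocks never move back toward the mobile tier
        energy, latency = placement_cost(placement)
        if latency <= max_latency_ms and (best is None or energy < best[1]):
            best = (placement, energy, latency)
    return best


if __name__ == "__main__":
    for bound in (100.0, 24.0, 10.0):
        print(bound, best_placement(bound))
```

With these made-up costs, tightening the latency bound pushes the first block off the mobile device at the price of extra energy, and a sufficiently tight bound leaves no feasible placement. Exhaustive enumeration grows exponentially with the number of blocks, which is why a graph-based formulation such as the one the abstract describes is attractive for larger models.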

Files

Files (4.4 MB)

Name	Size	MD5
Split_Inference_Infocom24.pdf	4.4 MB	0de82de5a36ee55f94db4e65bb9b3bab
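The MD5 value listed for the PDF can be used to check that a downloaded copy is intact. A minimal sketch using Python's standard `hashlib` module follows; the helper names `md5_of_file` and `matches_listing` and the local file path are assumptions, not part of the record.

```python
import hashlib

# Checksum as shown in the file listing of this record.
EXPECTED_MD5 = "0de82de5a36ee55f94db4e65bb9b3bab"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """True if the file at path (a hypothetical local copy) has the listed checksum."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_listing("Split_Inference_Infocom24.pdf")` would be called on the downloaded file in the current directory.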

Additional details

Funding: European Commission, grant 101095890 (PREDICT-6G – PRogrammable AI-Enabled DeterminIstiC neTworking for 6G)
