Combining Relevance and Magnitude for Resource-saving DNN Pruning
Description
Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel and predicated upon combining training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.
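The abstract describes scoring parameters by combining magnitude (training-time information) with relevance (inference-time information) and pruning the lowest-scoring ones. The sketch below is only an illustrative rendering of that idea, not the paper's actual FlexRel algorithm: the scoring function `flexrel_style_prune`, the mixing weight `alpha`, and the min-max normalisation are all assumptions made for the example.

```python
import numpy as np

def flexrel_style_prune(weights, relevance, alpha=0.5, prune_frac=0.35):
    """Illustrative combined-score pruning (NOT the paper's exact method).

    weights:    parameter tensor (|weights| is the magnitude signal)
    relevance:  per-parameter relevance scores, same shape as weights
    alpha:      assumed mixing weight between magnitude and relevance
    prune_frac: fraction of parameters to zero out
    """
    # Normalise both signals to [0, 1] so they are comparable.
    mag = np.abs(weights)
    mag = mag / (mag.max() + 1e-12)
    rel = relevance / (relevance.max() + 1e-12)

    # Combined score: parameters low in BOTH magnitude and relevance
    # are the best pruning candidates.
    score = alpha * mag + (1.0 - alpha) * rel

    # Zero out the prune_frac lowest-scoring parameters.
    k = int(prune_frac * weights.size)
    idx = np.argsort(score, axis=None)[:k]
    mask = np.ones(weights.size, dtype=bool)
    mask[idx] = False
    mask = mask.reshape(weights.shape)
    return weights * mask, mask
```

Pruning 35% of the parameters this way mirrors the bandwidth saving quoted above: only the surviving ~65% of weights need to be stored or transmitted.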
Files

| Name | Size | MD5 |
|---|---|---|
| Combining_Relevance_and_Magnitude_for_Resource-saving_DNN_Pruning.pdf | 379.7 kB | d8ac8ec274283f5576b3ef7d830c7b01 |
Additional details

Dates
- Accepted: 2025-04-24