Finding Storage- and Compute-Efficient Convolutional Neural Networks
Creators
Contributors
Supervisors:
Description
Convolutional neural networks (CNNs) have taken the spotlight in a variety of machine learning applications. To reach the desired performance, CNNs have become increasingly deep and large, which entails tremendous power and storage requirements. These growing computational and memory demands make deep learning prohibitive for resource-constrained hardware platforms such as mobile devices. To address this problem, we provide a general framework which renders efficient CNN representations that solve given classification tasks to specified quality levels. More precisely, the framework yields sparse and ternary neural networks, i.e., networks with many parameters set to zero and the non-zero parameters quantized from 32 bit to 2 bit. Ternary networks are efficient not only in terms of storage but also in terms of computational complexity. By explicitly boosting sparsity, we reach further efficiency gains. The proposed framework follows a two-step paradigm. First, a baseline model is extended by compound model scaling until a specified target accuracy is reached. Second, our Entropy-Constrained Trained Ternarization (EC2T) algorithm is applied, which simultaneously quantizes and sparsifies the scaled model. Here, a \(\lambda\)-operator balances the entropy constraint and thus the compression gain of the resulting network. We validated the effectiveness of EC2T in a variety of experiments, including CIFAR-10, CIFAR-100, and ImageNet classification tasks and the compression of renowned architectures (i.e., ResNets and EfficientNet) as well as our own compound-scaled models. We show the advantages of EC2T compared to the standard in ternary quantization, Trained Ternary Quantization (TTQ), and set new benchmarks in this research area. For instance, EC2T compresses ResNet-20 by more than 24x and reduces the number of arithmetic operations by more than 12x while causing a minimal accuracy degradation of 0.9%.
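The entropy-constrained assignment at the heart of such a scheme can be illustrated with a minimal sketch: each weight is mapped to one of the ternary centroids \(\{-w_n, 0, +w_p\}\) by minimizing its squared distance to the centroid plus a \(\lambda\)-weighted information cost, so that low-probability centroids are penalized and assignments are pushed toward the most frequent (typically zero) centroid, which boosts sparsity. The function name, the fixed centroid values, and the single-pass probability update below are illustrative assumptions and not the thesis's implementation, which trains the centroids and the full-precision weights jointly.

```python
import numpy as np

def ec_ternary_assign(w, w_n, w_p, lam, probs):
    """Assign each weight to a ternary centroid in {-w_n, 0, +w_p} by
    minimizing squared distance plus an entropy penalty -lam * log2 P(c).
    `probs` holds the current empirical probabilities of the three centroids."""
    centroids = np.array([-w_n, 0.0, w_p])
    # cost[i, c] = (w_i - centroid_c)^2 - lam * log2 P(c)
    dist = (w[:, None] - centroids[None, :]) ** 2
    info = -lam * np.log2(probs + 1e-12)   # rare centroids cost more bits
    cost = dist + info[None, :]
    labels = cost.argmin(axis=1)
    return centroids[labels], labels

# Toy usage: one assignment pass on random "weights" (values are arbitrary).
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=1000)
probs = np.array([1 / 3, 1 / 3, 1 / 3])    # start from uniform centroid probabilities
q, labels = ec_ternary_assign(w, w_n=0.05, w_p=0.05, lam=1e-3, probs=probs)
# Update the empirical probabilities from the new assignment; a larger lam
# drives more weights to the dominant (zero) centroid, i.e. higher sparsity.
probs = np.bincount(labels, minlength=3) / labels.size
```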
Files

| Name | Size |
|---|---|
| Becking-Finding_Storage_and_Compute_Efficient_Convolutional_Neural_Networks.pdf (md5:66642e4a68de53adde12a8ef755eabe6) | 1.3 MB |