Efficient Winograd-based Convolution Kernel Implementation on Edge Devices
The implementation of Convolutional Neural Networks on edge Internet of Things (IoT) devices is a significant programming challenge, due to the limited computational resources and the real-time requirements of modern applications. This work focuses on the efficient implementation of the Winograd convolution, based on a set of application-independent and Winograd-specific software techniques for improving the utilization of the edge devices computational resources. The proposed techniques were evaluated in Intel/Movidius Myriad2 platform, using 4 CNNs of various computational requirements. The results show significant performance improvements, up to 54%, over other convolution algorithms.