Published August 4, 2016 | Version v1
Conference paper Open

Convolutional Neural Network for 3D object recognition using volumetric representation

  • 1. Movidius Ltd. 1st Floor, O'Connell Br House, D'Olier St, Dublin, Ireland
  • 2. Trinity College Dublin, College Green, Dublin, Ireland

Description

Following the success of Convolutional Neural Networks (CNNs) in object recognition on 2D images, this paper extends them to process 3D data. Most current systems require a huge amount of computation to deal with large amounts of data. In this paper, an efficient 3D volumetric object representation, Volumetric Accelerator (VOLA), is presented, which requires much less memory than conventional volumetric representations. On this basis, several 3D digit datasets are introduced, generated from 2D MNIST and 2D digit fonts with different rotations about the x, y, and z axes. Finally, we introduce a combination of multiple CNN models based on the well-known LeNet model. The CNN models trained on the generated datasets achieved average accuracies of 90.30% and 81.85% on the 3D-MNIST and 3D-Fonts datasets, respectively. Experimental results show that VOLA-based CNNs perform 1.5x faster than the original LeNet.
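To illustrate why a compact volumetric representation can save memory, the following is a minimal sketch of a generic one-bit-per-voxel occupancy grid built from a 2D binary image extruded along z. It is not the actual VOLA format, which this record does not describe. The function names (extrude, pack_bits, get_voxel), the toy 4x4 image, and the float32 baseline are assumptions made purely for illustration.

```python
# Hypothetical sketch: generic 1-bit-per-voxel occupancy grid.
# This is NOT the VOLA layout; all names and sizes here are illustrative.

def extrude(image, depth):
    """Build a 3D volume[z][y][x] by repeating a 2D binary image along z."""
    return [[row[:] for row in image] for _ in range(depth)]

def pack_bits(volume):
    """Pack the volume into a bytearray, one bit per voxel (x varies fastest)."""
    flat = [v for plane in volume for row in plane for v in row]
    packed = bytearray((len(flat) + 7) // 8)
    for i, v in enumerate(flat):
        if v:
            packed[i >> 3] |= 1 << (i & 7)
    return packed

def get_voxel(packed, dims, x, y, z):
    """Read one voxel back; dims is (nx, ny, nz)."""
    nx, ny, _ = dims
    i = (z * ny + y) * nx + x
    return (packed[i >> 3] >> (i & 7)) & 1

# Toy 4x4 binary "digit" standing in for a 2D MNIST or font glyph.
image = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
dims = (4, 4, 4)
volume = extrude(image, depth=dims[2])
packed = pack_bits(volume)

# Assumed baseline: a dense grid storing one 4-byte float per voxel.
dense_bytes = dims[0] * dims[1] * dims[2] * 4
print(len(packed), dense_bytes)  # 8 256
```

In this toy case the bit-packed grid uses 8 bytes against 256 bytes for the assumed dense float32 grid. The saving comes only from storing occupancy as single bits, and says nothing about the speed-up reported in the paper.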

Notes

This research was funded by EC H2020-ICT-2014-1 GA: 643924.

Files

SPLINE2016_IEEE-Xiao-CNN3D_1.pdf (13.4 MB)
md5:25bf819bbaebe2baa321c5928fa10dc0

Additional details

Funding

European Commission
EoT - Eyes of Things 643924