Published July 2, 2019 | Version v1
Software Open

Visual Haptic beta-VAE

  • 1. Brain & Cognitive Sciences, University of Rochester
  • 2. Department of Computer Science, University of Rochester

Description

This Python code implements the system described in the article:

Jacobs, R. A. & Xu, C. (2019). Can multisensory training aid visual
learning?: A computational investigation. Journal of Vision, in press.

The code and the text here will make much more sense if the reader
first reads the article.

As described in the article, we implemented a beta variational autoencoder
(beta-VAE) that received both visual and haptic signals regarding the
shapes of objects. The implementation here is a slight variant of the
implementation described by Louis Tiao in his web post titled "Implementing
Variational Autoencoders in Keras: Beyond the Quickstart Tutorial".
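
The beta-VAE objective can be sketched in a few lines of NumPy. This is a
minimal illustration, not the code distributed with this record: the
squared-error reconstruction term, the latent size of 10, the value of beta,
and the concatenation of visual and haptic features into a single vector are
all assumptions made for the example.

```python
import numpy as np

def beta_vae_loss(x, x_recon, z_mean, z_log_var, beta=1.0):
    """Per-item beta-VAE loss: reconstruction error + beta * KL divergence."""
    # Squared-error reconstruction term (assumed; other likelihoods are possible).
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # Closed-form KL divergence between N(z_mean, exp(z_log_var)) and N(0, I).
    kl = -0.5 * np.sum(1.0 + z_log_var - z_mean ** 2 - np.exp(z_log_var), axis=1)
    return recon + beta * kl

rng = np.random.default_rng(0)
visual = rng.standard_normal((4, 200))  # 200 visual features per data item
haptic = rng.standard_normal((4, 200))  # 200 haptic features per data item
x = np.concatenate([visual, haptic], axis=1)  # assumed joint visual-haptic vector

# Hypothetical encoder/decoder outputs, used only to exercise the function.
z_mean = np.zeros((4, 10))
z_log_var = np.zeros((4, 10))
x_recon = x.copy()

loss = beta_vae_loss(x, x_recon, z_mean, z_log_var, beta=4.0)
```

With a perfect reconstruction and a posterior equal to the prior, both terms
vanish and the loss is zero. Setting beta above one weights the KL term more
heavily relative to the reconstruction term.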

In this code, the haptic data are the GraspIt! joint angles for
each Fribble. Recall that GraspIt! has 16 joints and that each Fribble was
grasped 24 times, giving 16 x 24 = 384 values per Fribble. The dimensionality
of these values was then reduced via PCA to 200 features (accounting
for more than 99% of the variance in the haptic values). Each
low-dimensional feature was then normalized to have a mean of zero
and a variance of one.
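
The haptic preprocessing steps (flatten, PCA, normalize) can be sketched as
follows. This is an illustration under assumptions, not the original
pipeline: the joint angles are random stand-ins, PCA is computed with a
plain SVD, one row per Fribble is assumed, and the helper names pca_reduce
and standardize are hypothetical.

```python
import numpy as np

def pca_reduce(data, n_components):
    """Project data onto its top principal components using an SVD."""
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return scores, explained

def standardize(features):
    """Scale each column to zero mean and unit variance."""
    return (features - features.mean(axis=0)) / features.std(axis=0)

rng = np.random.default_rng(0)
n_fribbles, n_grasps, n_joints = 891, 24, 16

# Random stand-in for the GraspIt! joint angles (real data not included here).
joint_angles = rng.standard_normal((n_fribbles, n_grasps, n_joints))

# Flatten 24 grasps x 16 joints into 384 values per Fribble.
haptic_raw = joint_angles.reshape(n_fribbles, n_grasps * n_joints)

haptic_pcs, explained = pca_reduce(haptic_raw, 200)
haptic = standardize(haptic_pcs)
```

On random data the retained variance is far below the 99% reported for the
real joint angles, which are presumably much more correlated across grasps.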

The visual data items were created as follows. First, there are two images
of each Fribble, the original image and a flipped (left-right) image. These
images were then presented to VGG16, and we extracted the output of the
convolutional base (7 x 7 x 512 = 25088 values). Given the values of the
convolutional base for each image of each Fribble (2 images x 891 Fribbles
= 1782 images), we then applied PCA to reduce the dimensionality to 200
(accounting for more than 97% of the variance in the convolutional base
values). Each of the values in the low-dimensional space was then normalized
to have a mean of zero and a variance of one.
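
The image bookkeeping before PCA can be sketched as below. This is only an
illustration: the 224 x 224 x 3 input size is an assumption (it is the
standard VGG16 input and is consistent with a 7 x 7 x 512 output), the
images are random, only 5 Fribbles are used to keep the arrays small, and
fake_conv_base is a hypothetical stand-in that returns random features
rather than calling VGG16.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fribbles = 5  # small stand-in; the description reports 891 Fribbles

# Random stand-in images in (item, height, width, channel) layout.
images = rng.random((n_fribbles, 224, 224, 3))

# Left-right flip: reverse the width axis of every image.
flipped = images[:, :, ::-1, :]
all_images = np.concatenate([images, flipped], axis=0)  # 2 images per Fribble

conv_base_shape = (7, 7, 512)
n_conv_values = int(np.prod(conv_base_shape))  # 7 * 7 * 512

def fake_conv_base(batch):
    """Hypothetical stand-in for the VGG16 convolutional base (random output)."""
    return rng.random((batch.shape[0],) + conv_base_shape)

# Flatten each 7 x 7 x 512 feature map into one row per image.
features = fake_conv_base(all_images).reshape(len(all_images), -1)
```

The resulting matrix has one row per image and 25088 columns; in the
description above, that matrix is what PCA reduces to 200 features.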

For each Fribble, there are 2 data items:
-- original image and haptic data
-- flipped image and haptic data

For each data item, the target labels include both visual and haptic data.
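
The pairing of visual and haptic features into data items can be sketched
as follows. This is an assumption-laden illustration: the feature arrays
are random stand-ins, the variable names are hypothetical, and feeding both
modalities as input is only one possible configuration. Only the fact that
the targets contain both visual and haptic data comes from the description.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fribbles, n_features = 891, 200

# Random stand-ins for the PCA-reduced, normalized features.
visual_orig = rng.standard_normal((n_fribbles, n_features))
visual_flip = rng.standard_normal((n_fribbles, n_features))
haptic = rng.standard_normal((n_fribbles, n_features))

# Each Fribble yields two items that share the same haptic features.
item_orig = np.concatenate([visual_orig, haptic], axis=1)
item_flip = np.concatenate([visual_flip, haptic], axis=1)

inputs = np.concatenate([item_orig, item_flip], axis=0)

# The targets hold both the visual and the haptic features of each item.
targets = inputs.copy()
```

Here inputs and targets each have 2 x 891 = 1782 rows and 200 + 200 = 400
columns, and rows i and i + 891 share identical haptic columns.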

Files

Files (22.1 kB)

md5:c5a4335e717948a30ca29c9a44f86110