Published March 22, 2023 | Version 1.0.0
Other Open

Model for Age and Gender Prediction based on Wav2vec 2.0

  • 1. audEERING GmbH
  • 2. EIHW, University of Augsburg

Description

The model expects a raw audio signal as input and outputs an age score in the range 0 to 1 (corresponding to 0 to 100 years) and gender predictions (female, male, child). It also provides the pooled states of the last transformer layer.

The model was created by fine-tuning a pre-trained wav2vec 2.0 model. As the foundation we used wav2vec2-large-robust, released by Facebook under the Apache 2.0 license. We provide two models: one with all 24 transformer layers and a stripped-down version with six transformer layers. Both models were exported to ONNX format.

For training we used aGender, Mozilla Common Voice, TIMIT and VoxCeleb2. For each database we provide file lists for the splits (train, dev, test) in audformat. The CSV files can be loaded as a pandas.DataFrame with audformat.utils.read_csv().

Further details are given in the associated paper (tba). For an introduction on how to use the model, please visit our tutorial project.
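The output ranges described above can be turned into readable values with a few lines of post-processing. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the label order simply follows the order given in the description, and it assumes the gender output arrives as unnormalized scores (logits), which this record does not state.

```python
import math

# Label order as listed in the description; the actual output order
# of the exported model is an assumption and should be checked.
GENDER_LABELS = ("female", "male", "child")


def age_score_to_years(score):
    """Map an age score in [0, 1] to years in [0, 100], clipping outliers."""
    clipped = min(max(score, 0.0), 1.0)
    return clipped * 100.0


def softmax(logits):
    """Numerically stable softmax over a sequence of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def gender_from_logits(logits):
    """Return (label, probability) for the highest-scoring gender class.

    Assumes `logits` holds one unnormalized score per entry in GENDER_LABELS.
    """
    if len(logits) != len(GENDER_LABELS):
        raise ValueError("expected one score per gender label")
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return GENDER_LABELS[best], probabilities[best]
```

For example, `age_score_to_years(0.34)` gives an age of roughly 34 years, and `gender_from_logits([0.1, 2.0, -1.0])` selects the second label. If the model already returns probabilities, the softmax step can be skipped.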

Files

splits.zip

Files (1.5 GB)

Size       Checksum
898.8 kB   md5:53d5134293f00722278dbf9e57401faf
1.2 GB     md5:4b54d14dcd7f0c38f00c05cf72b4cb9a
332.9 MB   md5:d1479956f0cf50107198276dc1b4f1d0
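The MD5 checksums listed for the files can be used to confirm that a download is complete. The sketch below uses only Python's standard hashlib module; the helper names are hypothetical, and which local file corresponds to which checksum is left to the reader, since the listing above does not pair names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum(local_path, "md5:53d5134293f00722278dbf9e57401faf")` returns True only if the file at `local_path` (a placeholder for wherever the download was saved) has that digest.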