Tailored Design of Audio-Visual Speech Recognition Models using Branchformers

Gimeno-Gómez, David; Martínez-Hinarejos, Carlos-D.

doi:10.5281/zenodo.12772516

There is a newer version of the record available.

Published June 3, 2024 | Version v2

Model Open

1. Universitat Politècnica de València

Official model checkpoints for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers". Checkpoints for our audio-only, video-only, and audio-visual models are available along with their corresponding model configuration files.

Source code for evaluating our models, fine-tuning them, and training new ones on your database of interest is available in our official GitHub repository.

Files

Files (3.1 GB)

Name	Size
model_checkpoints.zip (md5: a020f20eedfb39f2b524171127c8bae9)	3.1 GB
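
The MD5 digest listed above can be used to confirm that the 3.1 GB archive downloaded intact. The following is a minimal sketch, not part of the official repository: the function names `md5_of_file` and `matches_checksum` are hypothetical, and the archive is assumed to sit in the current working directory under its original file name.

```python
import hashlib
from pathlib import Path

# MD5 digest listed for model_checkpoints.zip on this record.
EXPECTED_MD5 = "a020f20eedfb39f2b524171127c8bae9"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected one, ignoring case and padding."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Assumption: the archive was saved here under its original name.
    archive = Path("model_checkpoints.zip")
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif matches_checksum(archive):
        print("checksum OK")
    else:
        print("checksum mismatch; the download may be incomplete or corrupted")
```

Reading in fixed-size chunks keeps memory use flat regardless of archive size. MD5 is adequate here for detecting transfer corruption, though it is not a defense against deliberate tampering.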

Additional details

Generalitat Valenciana
Grant CIACIF/2021/295
Ministerio de Ciencia, Innovación y Universidades
Grant PID2021-124719OB-I00

	All versions	This version
Views	202	17
Downloads	103	6
Data volume	385.4 GB	18.7 GB

Resource type: Model
Publisher: Zenodo
Languages: English, Spanish

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: July 18, 2024
Modified: July 18, 2024
