Published January 14, 2025 | Version v4
Model Open

Tailored Design of Audio-Visual Speech Recognition Models using Branchformers

  • 1. Universitat Politècnica de València

Description

Official model checkpoints for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers". Checkpoints for our audio-only, video-only, and audio-visual models are available along with their corresponding model configuration files. Checkpoints and configuration files are likewise provided for the language models (LMs) used during beam-search inference.

Source code to evaluate our models, fine-tune them, and train new ones for your database of interest can be found in our official GitHub repository.

Files

Files (3.1 GB)

Name: model_checkpoints.zip
Size: 3.1 GB
md5:b5f8215728f938fc0cbb352541a2455d
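
Because the archive is large, it can be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library. The local path `model_checkpoints.zip` and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration; they are not part of the official repository.

```python
import hashlib
from pathlib import Path

# Checksum published on this record for model_checkpoints.zip.
EXPECTED_MD5 = "b5f8215728f938fc0cbb352541a2455d"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location of the downloaded archive; adjust as needed.
    archive = Path("model_checkpoints.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.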

Additional details

Funding

Generalitat Valenciana
Grant CIACIF/2021/295
Ministerio de Ciencia, Innovación y Universidades
Grant PID2021-124719OB-I00