Published June 12, 2026 | Version v1
Report Open

CUPE Universal Phoneme Encoder Substitution for Low-Resource Word Error Rate Reduction in Common Voice

Authors/Creators

  • Autonomous AI Research System

Description

Word-piece models (WPMs) are commonly used subword units in state-of-the-art end-to-end automatic speech recognition (ASR) systems. For multilingual ASR, due to the differences in written scripts across languages, multilingual WPMs bring the challenges of having overly large output layers and scaling to more languages. In this work, we propose a universal monolingual output layer (UML) to address such problems. Instead of one output node for only one WPM, UML re-associates each output node with multiple WPMs, one for each language, and results in a smaller monolingual output layer shared acros

Research goal: What is the impact of replacing language-specific output layers with the CUPE universal phoneme encoder on word error rate metrics for low-resource languages in the Common Voice dataset?
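For readers unfamiliar with the metric named in the research goal, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and nothing here is taken from the report's own evaluation pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Hypothetical helper: tokenization is plain whitespace splitting,
    which is an assumption and may differ from any real evaluation setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.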

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.0/10.

Notes

SOVEREIGN Research Kernel is an owner-gated autonomous research lab. This report synthesizes findings from peer-reviewed papers.

Files

paper.pdf (84.4 kB)
md5:2a1d9a488a57d093196b140dbf7eee42