Published August 15, 2025 | Version 1.0
Report Open

Test-Time Compute for Japanese ASR: A Preliminary Study on Reducing Errors via LLM Post-Editing of Whisper Outputs

Description

We evaluate the effectiveness of test-time compute (TTC), that is, post-editing Whisper outputs with a large language model (LLM) without any training, for Japanese conversational speech. Using four real recordings (13, 43, 57, and 14 minutes long), we organize TTC in two stages: (A) Gemini 2.0 Flash makes minimal edits to low-confidence words only, and (B) OpenAI o3 (hereafter “o3”) proposes error candidates and replacements from a prompt containing a summary plus the full transcript.
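
As an illustration of stage (A), the following is a minimal sketch of how low-confidence words might be flagged before being passed to an LLM for minimal edits. It is not the report's implementation. It assumes word-level entries with `word` and `probability` fields (the shape Whisper produces when word timestamps are enabled); the threshold value, the `[[...]]` marking convention, the prompt wording, and the function names `mark_low_confidence` and `build_stage_a_prompt` are all hypothetical.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]

# Hypothetical cutoff: the description does not state the threshold used.
LOW_CONF_THRESHOLD = 0.5


def mark_low_confidence(words: List[Word],
                        threshold: float = LOW_CONF_THRESHOLD) -> str:
    """Join words into one string, wrapping low-confidence ones in [[...]].

    Japanese text has no spaces between words, so pieces are concatenated
    directly.
    """
    parts = []
    for w in words:
        text = str(w["word"])
        if float(w["probability"]) < threshold:
            parts.append(f"[[{text}]]")  # flagged as an edit candidate
        else:
            parts.append(text)           # left untouched
    return "".join(parts)


def build_stage_a_prompt(marked_transcript: str) -> str:
    """Build an illustrative instruction asking for minimal edits only.

    The wording is an assumption, not the prompt used in the report.
    """
    return (
        "Correct only the words wrapped in [[ ]] if they look like "
        "speech recognition errors. Leave all other text unchanged.\n\n"
        + marked_transcript
    )


if __name__ == "__main__":
    sample = [
        {"word": "今日", "probability": 0.98},
        {"word": "は", "probability": 0.95},
        {"word": "点期", "probability": 0.21},  # likely misrecognition
        {"word": "です", "probability": 0.90},
    ]
    print(build_stage_a_prompt(mark_low_confidence(sample)))
```

Marking candidates inline, rather than sending a separate list, keeps each flagged word in its sentence context, which is one plausible way to restrict the model to minimal edits.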

Files

test-time-compute-for-japanese-asr-a-preliminary-study-on-reducing-errors-via-llm-post-editing-of-whisper-outputs.pdf