Published August 15, 2025 | Version 1.0
Report
Open
Test-Time Compute for Japanese ASR: A Preliminary Study on Reducing Errors via LLM Post-Editing of Whisper Outputs
Creators
Description
We evaluate the effectiveness of test-time compute (TTC), here meaning post-editing of Whisper outputs with a large language model (LLM) without any training, for Japanese conversational speech. Using four real recordings (13, 43, 57, and 14 minutes), we organize TTC in two stages: (A) Gemini 2.0 Flash makes minimal edits to low-confidence words only, and (B) OpenAI o3 (hereafter “o3”) proposes error candidates and replacements from a prompt containing a summary and the full transcript.
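As a rough illustration of stage (A), the sketch below selects low-confidence words and hands only those to an editor. It is not the report's implementation: the `Word` type, the `[[...]]` marking convention, the 0.5 threshold, and the `edit_fn` callable (a stand-in for a call to an LLM such as Gemini 2.0 Flash) are all assumptions made for illustration, and it presumes that word-level confidence scores are available from the recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Word:
    text: str    # surface form from the ASR output
    prob: float  # word-level confidence in [0, 1] (assumed to be available)


def mark_for_editing(words: List[Word], threshold: float = 0.5) -> str:
    """Wrap low-confidence words in [[...]] so an editor can target only those.

    Japanese text has no spaces between words, so tokens are joined directly.
    The threshold value is a hypothetical choice, not taken from the report.
    """
    return "".join(
        f"[[{w.text}]]" if w.prob < threshold else w.text for w in words
    )


def stage_a(
    words: List[Word],
    edit_fn: Callable[[str, str], str],
    threshold: float = 0.5,
) -> str:
    """Apply edit_fn only to low-confidence words; leave the rest untouched.

    edit_fn(word, context) is a hypothetical hook standing in for an LLM call;
    it receives the flagged word and the marked transcript as context.
    """
    context = mark_for_editing(words, threshold)
    edited = [
        edit_fn(w.text, context) if w.prob < threshold else w.text
        for w in words
    ]
    return "".join(edited)
```

For example, with words `今日` (0.95), `は` (0.90), `海議` (0.30), `です` (0.92), only `海議` falls below the threshold, so the editor is consulted once and every high-confidence word passes through unchanged. Keeping the edit set small in this way is what makes the stage "minimal".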
Files
| Name | Size | Checksum |
|---|---|---|
| test-time-compute-for-japanese-asr-a-preliminary-study-on-reducing-errors-via-llm-post-editing-of-whisper-outputs.pdf | 225.8 kB | md5:b70525e3a9ab31e5e9b5f3f19c78861c |