Test-Time Compute for Japanese ASR: A Preliminary Study on Reducing Errors via LLM Post-Editing of Whisper Outputs

Ko, Ohashi

doi:10.5281/zenodo.16881446

Published August 15, 2025 | Version 1.0

Report Open

Ko, Ohashi (Researcher)

We evaluate the effectiveness of test-time compute (TTC), here meaning post-editing of Whisper outputs with a large language model (LLM) without any training, for Japanese conversational speech. Using four real recordings (13, 43, 57, and 14 minutes), TTC is organized in two stages: (A) Gemini 2.0 Flash makes minimal edits to low-confidence words only, and (B) OpenAI o3 (hereafter “o3”) proposes error candidates and replacements based on a prompt containing a summary and the full transcript.
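Stage (A) can be illustrated with a minimal sketch. The abstract does not give the report's prompts, confidence threshold, or data format, so everything here is an assumption made for illustration: the `Word` class, the 0.5 threshold, the `<<`/`>>` markers, and the example sentence are all hypothetical. The LLM call itself is omitted, and the `replacements` dictionary stands in for whatever the model would return.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    prob: float  # per-word confidence in [0, 1]; assumed to be supplied by the ASR system


def low_confidence_indices(words, threshold=0.5):
    """Return the indices of words whose confidence falls below the threshold."""
    return [i for i, w in enumerate(words) if w.prob < threshold]


def mark_for_editing(words, threshold=0.5, open_tag="<<", close_tag=">>"):
    """Build a prompt-ready string in which only low-confidence words are marked."""
    flagged = set(low_confidence_indices(words, threshold))
    return "".join(
        f"{open_tag}{w.text}{close_tag}" if i in flagged else w.text
        for i, w in enumerate(words)
    )


def apply_replacements(words, replacements, threshold=0.5):
    """Apply proposed edits, ignoring any that target a word that was not flagged.

    Dropping edits to high-confidence words is one way to keep the edits minimal.
    """
    flagged = set(low_confidence_indices(words, threshold))
    return "".join(
        replacements[i] if i in flagged and i in replacements else w.text
        for i, w in enumerate(words)
    )


# Hypothetical transcript: the second word is a low-confidence misrecognition.
words = [
    Word("今日は", 0.98),
    Word("回避", 0.31),
    Word("に", 0.95),
    Word("出席", 0.90),
    Word("します", 0.97),
]

marked = mark_for_editing(words)
# Stand-in for an LLM response; the edit at index 3 targets a confident word and is dropped.
fixed = apply_replacements(words, {1: "会議", 3: "欠席"})
```

In this sketch the guard in `apply_replacements` restricts the post-editor to the flagged positions, so a model that tries to rewrite confident words cannot introduce new errors there.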

Files (225.8 kB)

Name	Size
test-time-compute-for-japanese-asr-a-preliminary-study-on-reducing-errors-via-llm-post-editing-of-whisper-outputs.pdf md5:b70525e3a9ab31e5e9b5f3f19c78861c	225.8 kB

	All versions	This version
Views	107	107
Downloads	49	49
Data volume	13.5 MB	13.5 MB

Resource type

Report

Publisher

Zenodo

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: August 15, 2025
Modified: August 15, 2025
