A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect

Xu, Binbin; Tao, Chongyang; Feng, Zidu; Raqui, Youssef; Ranwez, Sylvie

doi:10.5281/zenodo.8117301

Published July 2, 2021 | Version v1

Conference paper Open

1. EuroMov Digital Health in Motion, Univ Montpellier, IMT Mines Ales
2. DiappyMed

This study presents a large-scale benchmark of four cloud-based Speech-To-Text (STT) systems: Google Cloud Speech-To-Text, Microsoft Azure Cognitive Services, Amazon Transcribe, and IBM Watson Speech to Text. For each system, 40158 clean and noisy speech files, about 101 hours in total, are tested. The effect of background noise on STT quality is also evaluated at 5 different signal-to-noise ratios, from 40 dB to 0 dB. The results show that Microsoft Azure provides the lowest transcription error rate, 9.09% on clean speech, and is highly robust to noisy environments. Google Cloud and Amazon Transcribe give similar performance, but the latter is very limited for time-constrained usage. Although IBM Watson works correctly in quiet conditions, it is highly sensitive to noisy speech, which could strongly limit its application in real-life situations.
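The two quantities the abstract relies on, a transcription error rate and a signal-to-noise ratio, can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes the error rate is a word error rate (word-level edit distance divided by the number of reference words) and that noisy files are built by scaling a noise signal to a target SNR before adding it to clean speech. The function names `word_error_rate` and `mix_at_snr` and the sample sentences are hypothetical.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`
    (in decibels), then add it to `speech` sample by sample."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise signal has zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion over 6 words
    print(word_error_rate("le chat dort sur le lit", "le chien dort sur lit"))
    # Toy signals mixed at 20 dB SNR
    print(mix_at_snr([1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5], 20))
```

In practice, transcripts are usually normalized (case, punctuation, numbers) before scoring, and the choice of normalization can shift the reported error rate noticeably.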

Files (954.5 kB)

Name	Size
article_stt_apia2021.pdf md5:86f05718b3c33c1ae618a2c004ca0dc8	954.5 kB

Additional details

Is identical to: arXiv:2105.03409 (arXiv)

Views and downloads

	All versions	This version
Views	75	75
Downloads	135	135
Data volume	133.6 MB	133.6 MB

Resource type

Conference paper

Publisher

Zenodo

Conference

The 6th National Conference on Practical Applications of Artificial Intelligence (APIA2021), Bordeaux, France, 28 June to 2 July 2021

Languages

English

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: July 5, 2023
Modified: July 11, 2024
