Marathi Speech Database Standardization: A Review and Work

doi:10.5281/zenodo.5501910

Published September 12, 2021 | Version v1

Journal article Open

1. Research Student, Department of CS & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India
2. Research Student, Department of CS & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India
3. Assistant Professor, Department of CS & IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, Maharashtra, India

Abstract: An Automatic Speech Recognition (ASR) system supports
interaction between humans and machines. It lets users operate
computers and mobile phones through speech alone, without extra
effort. The term corpus refers to a standardized database
containing a collection of audio recordings of spoken language
together with their annotations and documentation. A review of the
existing literature shows that much has been written on how to
create speech databases, but little on their standardization. Such
standardization work has been done for languages other than Indian
languages; for Hindi, Marathi, and similar languages, the
standardization of speech datasets is not up to the mark. The main
problem in designing a speech database is dealing with the
variability of speech. In recent years there has been a growing
need to develop speech corpora as training and testing material for
a wide range of speech technology applications, such as Linguistic
Consortium, speech interface development, and language models. If
speech databases are standardized for regional languages, they will
contribute to many applications and much research. In future, we
would like to find a standard way to standardize speech databases
so that data can be retrieved easily and more efficiently.

Keywords: ASR, Corpus, Speech Database, Standardization,
Annotation

Files

Name	Size	Checksum
10 Paper 01072114 IJCSIS Camera Ready pp92-97.pdf	1.5 MB	md5:680ca1040a41a0655303c2f42f56dfd1

