A Brief Survey on Emotion Based Text to Speech Conversion System

DOI: 10.35940/ijsce.A3529.0911121
https://zenodo.org/records/5514958
oai:zenodo.org:5514958

Authors: Supriya Dhanaraj Dhumale, Manjiri Vitthal Khopade, Bhushan Dhimate, Avadhoot Yogesh Dhere
Affiliation (all authors): Department of Computer Science, Savitribai Phule Pune University, Pune (Maharashtra), India

Keywords: Emotion recognition, Text to Speech, GRU
Publisher: Blue Eyes Intelligence Engineering & Sciences Publication (BEIESP)
Zenodo, 2021
Dates: 2021-09-30, 2021-09-19
Language: eng
ISSN: 2231-2307
License: Creative Commons Attribution 4.0 International

Text-to-speech conversion is one application of machine learning. It is widely used in search engines, standalone applications, web applications, chatbots, and Android applications. However, text-to-speech systems still need to be upgraded to make applications more interactive and user-friendly. A traditional text-to-speech application produces a monotonous voice that carries no emotion and sounds mechanical, so the existing system needs to be improved by adding emotion to its output. Existing text-to-speech systems are not suitable for storytelling applications and do not provide effective communication. Most text-to-speech systems are developed using algorithms such as Support Vector Machine (SVM) and Naïve Bayes. An emotion-based text-to-speech system will help improve on the existing text-to-speech systems.
Machine learning and deep learning algorithms such as the Recurrent Neural Network can be used to perform sentiment analysis and semantic analysis on the input text. We will use a neural network, which is more effective and helps maintain the relation between the previous word and the next word. The emotion-based text-to-speech system will be able to identify four emotions: 'happy', 'sad', 'angry', and 'neutral'. It will be useful for educational purposes, such as listening to stories in storytelling applications for young children, and it will also serve visually impaired individuals.
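
To illustrate how a recurrent network can carry information from earlier words to later ones, the following is a minimal sketch of a GRU cell (GRU appears in the keywords) that reads a sentence word by word and outputs a probability for each of the four emotions. It is not the authors' implementation: the class name `TinyGRUClassifier`, the layer sizes, the whitespace tokenizer, and the toy vocabulary are all assumptions made for illustration, and the weights are random and untrained, so the printed probabilities carry no meaning.

```python
import math
import random

# The four classes named in the abstract.
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    """Small random weights; a real system would learn these from data."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyGRUClassifier:
    """Hypothetical, untrained GRU text classifier (illustration only)."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=16, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.hidden_dim = hidden_dim
        # One extra embedding row is shared by all out-of-vocabulary words.
        self.embed = rand_matrix(len(vocab) + 1, embed_dim, rng)
        # Input (W*) and recurrent (U*) weights for the update gate z,
        # the reset gate r, and the candidate state.
        self.Wz = rand_matrix(hidden_dim, embed_dim, rng)
        self.Uz = rand_matrix(hidden_dim, hidden_dim, rng)
        self.Wr = rand_matrix(hidden_dim, embed_dim, rng)
        self.Ur = rand_matrix(hidden_dim, hidden_dim, rng)
        self.Wh = rand_matrix(hidden_dim, embed_dim, rng)
        self.Uh = rand_matrix(hidden_dim, hidden_dim, rng)
        # Output layer mapping the final hidden state to emotion scores.
        self.Wo = rand_matrix(len(EMOTIONS), hidden_dim, rng)

    def step(self, x, h):
        """One GRU step: combine the current word x with the previous state h."""
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, reset_h))]
        # Convention used here: z near 0 keeps the old state, z near 1 takes
        # the candidate (some references swap the roles of z and 1 - z).
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def predict_proba(self, text):
        """Return a dict of emotion -> probability for the given text."""
        h = [0.0] * self.hidden_dim
        for word in text.lower().split():
            idx = self.vocab.get(word, len(self.vocab))
            h = self.step(self.embed[idx], h)
        logits = matvec(self.Wo, h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]  # numerically stable softmax
        total = sum(exps)
        return {emotion: e / total for emotion, e in zip(EMOTIONS, exps)}


model = TinyGRUClassifier(vocab=["i", "am", "so", "happy", "sad", "angry", "today"])
print(model.predict_proba("I am so happy today"))
```

The hidden state `h` is what links the previous word to the next one: each call to `step` mixes the new word with everything read so far. In a complete system the weights would be trained on labeled text, and the predicted emotion would then select the speaking style of the synthesized voice.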