SPEAK: An AI-Based Assistive Video Communication System for Speech and Sign Language Translation
Description
Effective communication between the hearing and deaf/hard-of-hearing (DHH) communities remains difficult, especially in digital video conferencing. Although platforms such as Zoom and Google Meet are widely used, they lack native, real-time bidirectional translation, so users frequently depend on costly human interpreters or invasive hardware sensors. To close this modality gap, this paper presents SPEAK (Sign Processing Enhanced Audio Kommunicator), a novel sensor-less, browser-based platform. By translating spoken language to text captions for DHH users and sign language to text/speech for hearing users, SPEAK enables smooth, two-way communication.
For visual recognition, the system’s architecture uses the Detection Transformer (DETR) model with a ResNet-50 backbone. In contrast to conventional CNN-based detectors that rely on region proposals, DETR formulates detection as a direct set prediction problem using a bipartite matching loss and self-attention mechanisms, which enhances robustness against complex backgrounds and removes the need for intricate, handcrafted anchors. The audio pipeline combines OpenAI’s Whisper model for high-fidelity Automatic Speech Recognition (ASR) with Microsoft’s SpeechT5 for natural Text-to-Speech (TTS) synthesis, and uses Voice Activity Detection (VAD) to save bandwidth. To keep video frames and translation outputs synchronized, all modules are coordinated within a low-latency WebRTC environment using a Flask-React framework. Experimental evaluation on a custom dataset under various lighting conditions validates SPEAK as a scalable, affordable solution for inclusive digital interaction, showing a sign detection accuracy of 92
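The bipartite matching step that DETR-style set prediction relies on can be illustrated with a small sketch. This is not SPEAK's implementation: the dictionary layout, the function names `match_cost` and `bipartite_match`, and the `box_weight` value are assumptions chosen for illustration, the cost omits the generalized IoU term that DETR also uses, and the search is a brute-force enumeration that is only practical for a handful of objects (a production system would typically use the Hungarian algorithm instead).

```python
from itertools import permutations


def match_cost(pred, target, box_weight=5.0):
    """Cost of assigning one prediction to one ground-truth object.

    Simplified cost (assumption): negative class probability plus a
    weighted L1 distance between boxes given as (cx, cy, w, h).
    """
    class_cost = -pred["probs"][target["label"]]
    box_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return class_cost + box_weight * box_cost


def bipartite_match(preds, targets):
    """Find the one-to-one assignment of targets to predictions with minimum total cost.

    Returns (assignment, total_cost), where assignment[j] is the index of the
    prediction matched to target j. Brute force, for illustration only.
    """
    best_assignment, best_cost = None, float("inf")
    for assignment in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[i], targets[j]) for j, i in enumerate(assignment))
        if cost < best_cost:
            best_assignment, best_cost = assignment, cost
    return best_assignment, best_cost


# Hypothetical example: three predicted slots, two ground-truth signs.
preds = [
    {"probs": [0.1, 0.9], "box": (0.50, 0.50, 0.20, 0.20)},
    {"probs": [0.8, 0.2], "box": (0.20, 0.30, 0.10, 0.10)},
    {"probs": [0.5, 0.5], "box": (0.90, 0.90, 0.05, 0.05)},
]
targets = [
    {"label": 0, "box": (0.20, 0.30, 0.10, 0.10)},
    {"label": 1, "box": (0.50, 0.50, 0.20, 0.20)},
]

assignment, cost = bipartite_match(preds, targets)
print(assignment)  # index of the prediction matched to each target
```

In this toy case, target 0 is matched to prediction 1 and target 1 to prediction 0; the remaining prediction stays unmatched, which in DETR corresponds to the "no object" class. Because each ground-truth object is tied to exactly one prediction, no anchors or duplicate-suppression step are needed.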
Files
81.pdf (595.0 kB)
md5:3b9d23d7b43ff1d160b2d8415336eabf