Published March 6, 2026 | Version v1
Journal article
Open
Zentry AI: A Self-Hosted Malayalam Telephony Conversational Agent Using Open-Source Speech and Language Models
Authors/Creators
- 1. Toc H Institute of Science & Technology (TIST)
Description
Educational institutions in Kerala face a recurring
operational burden during admission seasons, when telephonic
enquiries from prospective students surge significantly. These
interactions are predominantly conducted in Malayalam —
a morphologically complex Dravidian language characterised
by regional dialectal variation and pervasive code-mixing with
English, locally termed "Manglish." Existing Interactive Voice
Response (IVR) systems and generic multilingual chatbots are
ill-equipped for such linguistically nuanced interactions, while
cloud-based LLM APIs introduce recurring per-call costs and
student data privacy concerns.
This paper presents Zentry AI, a fully self-hosted, real-time
Malayalam conversational admission assistant designed
for deployment over standard telephony infrastructure. The
system integrates Twilio for SIP call management via TwiML
Bins, a FastAPI/ngrok webhook server, a Malayalam-optimised
Faster-Whisper model for automatic speech recognition (ASR),
IndicTrans2 for bidirectional Malayalam–English neural machine
translation, a Retrieval-Augmented Generation (RAG) framework
powered by Phi-4 Mini (3.8B) via llama.cpp, and a
W8A16 mixed-precision quantised Indic Parler-TTS model for
speech synthesis. A post-call LLM extraction module additionally
structures caller intent and personal details from completed call
transcripts into an administrative database.
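The per-turn data flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stage order (ASR, Malayalam-to-English translation, RAG answer, English-to-Malayalam translation, TTS) is inferred from the component list, and the `TurnPipeline` class, its field names, and the stub stages are all hypothetical. In a real deployment each callable would wrap the corresponding model (Faster-Whisper, IndicTrans2, Phi-4 Mini via llama.cpp, Indic Parler-TTS).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TurnPipeline:
    """Hypothetical wiring of one conversational turn; every stage is injected."""
    transcribe: Callable[[bytes], str]   # ASR: caller audio -> Malayalam text
    ml_to_en: Callable[[str], str]       # translation: Malayalam -> English
    answer: Callable[[str], str]         # RAG + LLM: English query -> English reply
    en_to_ml: Callable[[str], str]       # translation: English -> Malayalam
    synthesise: Callable[[str], bytes]   # TTS: Malayalam text -> reply audio
    transcript: List[str] = field(default_factory=list)

    def handle_turn(self, audio: bytes) -> bytes:
        ml_query = self.transcribe(audio)
        en_query = self.ml_to_en(ml_query)
        en_reply = self.answer(en_query)
        ml_reply = self.en_to_ml(en_reply)
        # Keep the transcript so a post-call extraction step could read it later.
        self.transcript.append(f"caller: {ml_query}")
        self.transcript.append(f"agent: {ml_reply}")
        return self.synthesise(ml_reply)


# Stub stages (assumptions for illustration only) that tag the text they pass on.
pipeline = TurnPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    ml_to_en=lambda text: f"en({text})",
    answer=lambda text: f"reply-to[{text}]",
    en_to_ml=lambda text: f"ml({text})",
    synthesise=lambda text: text.encode("utf-8"),
)
reply_audio = pipeline.handle_turn(b"query")
```

Injecting the stages as callables keeps the sketch independent of any particular model library, and makes it straightforward to time each stage separately when looking for latency bottlenecks.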
Prototype evaluation on consumer-grade hardware (NVIDIA
RTX 3080 class) reveals an end-to-end conversational latency of
approximately 3.5–4.5 seconds, with autoregressive TTS generation
and LLM inference identified as the dominant bottlenecks.
The deployed ASR model achieves a Word Error Rate of 11.49%
with text normalisation on the Common Voice 11.0 Malayalam
evaluation set. These results demonstrate the functional viability
of a privacy-preserving, cost-efficient conversational agent for
regional Indian languages, while documenting the hardware
constraints that remain to be overcome.
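For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the number of reference words. The sketch below shows one way to compute it for a single utterance. The `normalise` function is an assumption for illustration (Unicode NFC, lowercasing, punctuation removal, whitespace collapsing); the abstract does not specify which normalisation rules the authors applied.

```python
import unicodedata


def normalise(text: str) -> str:
    """Assumed normalisation: NFC, lowercase, punctuation -> space, collapse spaces."""
    text = unicodedata.normalize("NFC", text).lower()
    text = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch for ch in text
    )
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = normalise(reference).split()
    hyp = normalise(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Corpus-level figures such as the one reported here are usually obtained by summing edit distances and reference word counts over all utterances before dividing, rather than by averaging per-utterance scores.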
Files
| Name | Size |
|---|---|
| zentry-ai-a-self-hosted-malayalam-telephony-conversational-agent-using-open-source-speech-and-langua-IJERTV15IS020753.pdf (md5:d1cf746eb0a04eef1471cdf16cffc6d2) | 361.7 kB |
Additional details
Related works
- Is identical to
- Journal article: https://www.ijert.org/zentry-ai-a-self-hosted-malayalam-telephony-conversational-agent-using-open--source-speech-and-language-models (URL)