Published March 6, 2026 | Version v1
Journal article Open

Zentry AI: A Self-Hosted Malayalam Telephony Conversational Agent Using Open-Source Speech and Language Models

  • 1. Toc H Institute of Science & Technology (TIST)

Description

Educational institutions in Kerala face a recurring operational burden during admission seasons, when telephone enquiries from prospective students surge. These interactions are conducted predominantly in Malayalam, a morphologically complex Dravidian language characterised by regional dialectal variation and pervasive code-mixing with English, locally termed "Manglish." Existing Interactive Voice Response (IVR) systems and generic multilingual chatbots are ill-equipped for such linguistically nuanced interactions, while cloud-based LLM APIs introduce recurring per-call costs and raise student data privacy concerns. This paper presents Zentry AI, a fully self-hosted, real-time Malayalam conversational admission assistant designed for deployment over standard telephony infrastructure. The system integrates Twilio for SIP call management via TwiML Bins, a FastAPI/ngrok webhook server, a Malayalam-optimised Faster-Whisper model for automatic speech recognition (ASR), IndicTrans2 for bidirectional Malayalam–English neural machine translation, a Retrieval-Augmented Generation (RAG) framework powered by Phi-4 Mini (3.8B) via llama.cpp, and a W8A16 mixed-precision quantised Indic Parler-TTS model for speech synthesis. A post-call LLM extraction module additionally structures caller intent and personal details from completed call transcripts into an administrative database. Prototype evaluation on consumer-grade hardware (NVIDIA RTX 3080 class) shows an end-to-end conversational latency of approximately 3.5–4.5 seconds, with autoregressive TTS generation and LLM inference identified as the dominant bottlenecks. The deployed ASR model achieves a word error rate (WER) of 11.49% with text normalisation on the Common Voice 11.0 Malayalam evaluation set. These results demonstrate the functional viability of a privacy-preserving, cost-efficient conversational agent for regional Indian languages, while documenting the hardware constraints that remain to be overcome.
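The word error rate quoted above depends on how the text is normalised before scoring. The following is a minimal sketch of a normalised WER computation, not the authors' evaluation script: the function names `normalise` and `word_error_rate` are hypothetical, and the normalisation rules (NFC, lowercasing, replacing punctuation and symbols with spaces) are assumptions about what "text normalisation" might involve. Punctuation is filtered by Unicode category rather than a `\w` regular expression, because Malayalam vowel signs are combining marks that `\w` does not match.

```python
import unicodedata


def normalise(text: str) -> list[str]:
    """Assumed normalisation: NFC, lowercase, punctuation/symbols -> spaces."""
    text = unicodedata.normalize("NFC", text).lower()
    # Categories starting with "P" (punctuation) or "S" (symbol) become spaces;
    # letters, combining marks and joiners are kept so Malayalam words stay intact.
    cleaned = "".join(
        " " if unicodedata.category(ch)[0] in "PS" else ch for ch in text
    )
    return cleaned.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = normalise(reference)
    hyp = normalise(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A corpus-level figure such as the one reported is normally obtained by summing edit counts and reference lengths over all utterances rather than averaging per-utterance rates, so this per-utterance function would need a small wrapper for that purpose.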

Files

zentry-ai-a-self-hosted-malayalam-telephony-conversational-agent-using-open-source-speech-and-langua-IJERTV15IS020753.pdf

Additional details