Category: Software (Accessibility & Communication)
Difficulty: Intermediate
Time to Build: 4–6 weeks
Prerequisites: Python basics, familiarity with real-time audio streaming, basic Docker/server deployment
Deliverables: SRS, architecture diagram, backend scripts, browser extensions, test report, working prototype
Table of Contents
Problem Statement & Expected Outcome
Problem Statement
Phone and VoIP calls remain largely inaccessible for people with hearing impairments. Delays in captions or reliance on external devices make real-time communication difficult.
Expected Outcome
A low-latency web and mobile application that listens to live calls, transcribes speech in near real time, and displays accurate captions in Indian languages, providing equal communication access at home, at work, and in education.
Abstract
This project provides a free, open-source alternative to the Nagish app, designed specifically for Indian users. It uses OpenAI’s Whisper model for speech-to-text, supports multilingual transcription and translation, and ships browser and iOS clients. The kit includes ready-to-run Docker images, browser extensions, and Python server scripts, enabling quick deployment in classrooms, workplaces, and call centers.
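Before wiring up live streaming, transcription and translation can be sanity-checked on a recorded clip with the openai-whisper package. The sketch below is a minimal example, not part of the kit itself; the "small" model size, the file name sample_hi.wav, and the "hi" language code are placeholders to swap for your own test material.

```python
# Minimal batch check of Whisper transcription and translation.
# Assumes the openai-whisper package is installed and a short
# recorded clip "sample_hi.wav" exists (both are placeholders).
import whisper

model = whisper.load_model("small")  # multilingual checkpoint

# Transcribe Hindi speech into Hindi text.
transcript = model.transcribe("sample_hi.wav", language="hi")
print("Transcript:", transcript["text"])

# Translate the same clip into English using Whisper's built-in task.
translation = model.transcribe("sample_hi.wav", task="translate")
print("Translation:", translation["text"])
```

If both outputs look reasonable, the same model size can then be loaded in the streaming server for the real-time path.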
Progress Checklist
- WhisperLive server deployed (Docker/Native)
- Multilingual model loaded and tested
- Browser/iOS client connected and streaming audio (see the client sketch after this checklist)
- Real-time captions displayed and saved
- Translation module verified
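For the streaming and caption-saving items above, a thin test client is useful independently of the browser extension. The sketch below streams microphone audio over a WebSocket and prints and appends whatever caption text comes back. The ws://localhost:9090 endpoint, the raw 16 kHz 16-bit mono PCM frame format, and the {"text": ...} JSON response shape are all assumptions to verify against the WhisperLive server version you deploy; they are not its documented protocol.

```python
# Sketch of a streaming caption client. Assumptions to verify:
# the server listens at ws://localhost:9090, accepts raw 16 kHz
# 16-bit mono PCM frames, and replies with JSON {"text": "..."}.
import asyncio
import json

import sounddevice as sd
import websockets

SERVER_URL = "ws://localhost:9090"   # placeholder endpoint
SAMPLE_RATE = 16000
BLOCK_SIZE = 4096                    # samples per audio frame


async def stream_microphone() -> None:
    audio_queue: asyncio.Queue[bytes] = asyncio.Queue()
    loop = asyncio.get_running_loop()

    def on_audio(indata, frames, time_info, status) -> None:
        # Called by sounddevice from its own thread; hand the raw
        # bytes to the asyncio loop without blocking.
        loop.call_soon_threadsafe(audio_queue.put_nowait, bytes(indata))

    async with websockets.connect(SERVER_URL) as ws:
        with sd.RawInputStream(samplerate=SAMPLE_RATE, blocksize=BLOCK_SIZE,
                               dtype="int16", channels=1, callback=on_audio):

            async def sender() -> None:
                # Forward captured audio frames to the server.
                while True:
                    await ws.send(await audio_queue.get())

            async def receiver() -> None:
                # Display and save captions as they arrive.
                async for message in ws:
                    caption = json.loads(message)
                    text = caption.get("text", "")
                    print(text)
                    with open("captions.txt", "a", encoding="utf-8") as f:
                        f.write(text + "\n")

            await asyncio.gather(sender(), receiver())


if __name__ == "__main__":
    asyncio.run(stream_microphone())
```

Running it while speaking into the microphone gives a quick end-to-end check of the "streaming audio" and "captions displayed and saved" items before the browser or iOS clients are tested.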