Indian Version of Nagish App – Real-time Call Captioning

By Himanshu Garg, September 17, 2025

Category: Software (Accessibility & Communication)
Difficulty: Intermediate
Time to Build: 4–6 weeks
Prerequisites: Python basics, real-time audio streaming, basic Docker/Server deployment
Deliverables: SRS, architecture diagram, backend scripts, browser extensions, test report, working prototype

Problem Statement & Expected Outcome

Problem Statement
Phone and VoIP calls remain largely inaccessible for people with hearing impairments. Delays in captions or reliance on external devices make real-time communication difficult.

Expected Outcome
A low-latency web and mobile application that listens to live calls, transcribes speech in near real time, and displays accurate captions in Indian languages, providing equal communication access at home, work, or in education.

Abstract

This project provides a free, open-source alternative to the Nagish app, designed specifically for Indian users. It uses OpenAI’s Whisper model for speech-to-text and supports multilingual transcription and translation, with browser and iOS clients. The kit includes ready-to-run Docker images, browser extensions, and Python server scripts, enabling quick deployment in classrooms, workplaces, and call centers.

Details: User Stories & Acceptance Criteria

As a hearing-impaired user, I can receive captions in Hindi, English, or my regional language while speaking on a phone or VoIP call.
As a call participant, I can see a live caption feed and save transcripts for future reference.
Acceptance: Captions appear within 2–3 seconds of speech; accuracy is ≥90% in quiet environments; transcripts persist across sessions only if the user chooses to save them.
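The acceptance criteria above can be turned into an automated check. The sketch below is one possible approach, not part of the kit: it assumes "accuracy" means 1 minus the word error rate (WER) against a hand-made reference transcript, which the brief does not specify. The function names `word_error_rate` and `meets_acceptance` are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from captions
                curr[j - 1] + 1,     # insertion: extra word in captions
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def meets_acceptance(reference: str, hypothesis: str, latency_s: float,
                     max_latency_s: float = 3.0,
                     min_accuracy: float = 0.90) -> bool:
    """Check one utterance against the latency and accuracy targets.

    Assumption: accuracy is defined as 1 - WER.
    """
    accuracy = 1.0 - word_error_rate(reference, hypothesis)
    return latency_s <= max_latency_s and accuracy >= min_accuracy
```

In a test report, a team might run this over a set of recorded utterances per language and report the share that pass. The whitespace split is only a rough tokenizer and may need adjusting for some scripts.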

Scope & Modules

Module 1: Audio Capture and Server

Captures audio from phone, VoIP, RTSP/HLS stream, or browser microphone. Supports Faster-Whisper, TensorRT, and OpenVINO backends for speed and hardware flexibility.
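To illustrate the capture step, here is a minimal sketch of splitting captured audio into short chunks for streaming. It assumes 16 kHz mono audio (the rate Whisper models operate on), float samples in the range -1.0 to 1.0, and 16-bit little-endian PCM on the wire. The wire format and the 250 ms chunk size are assumptions made for illustration; the format the chosen server actually expects should be checked in its documentation.

```python
import struct
from typing import Iterable, Iterator, List

SAMPLE_RATE = 16_000  # Hz; assumed, since Whisper models work on 16 kHz mono audio


def float_to_pcm16(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    # Clip first so out-of-range samples cannot overflow the 16-bit range.
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_audio(samples: List[float], chunk_ms: int = 250) -> Iterator[bytes]:
    """Yield PCM chunks of chunk_ms milliseconds; the last chunk may be shorter."""
    step = SAMPLE_RATE * chunk_ms // 1000  # samples per chunk
    for start in range(0, len(samples), step):
        yield float_to_pcm16(samples[start:start + step])
```

Smaller chunks reduce the delay before the server sees new audio but increase the number of messages sent, so the chunk size is one of the knobs to tune against the latency target.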

Module 2: Real-Time Transcription Engine

Runs a WhisperLive server with a multilingual model, voice activity detection, and translation threads for Hindi-English or other Indian language pairs.
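Voice activity detection decides which audio frames are worth sending to the model. The sketch below shows the idea with a simple energy threshold. It is a stand-in for illustration only and is not the detector used by WhisperLive; the 30 ms frame length (480 samples at 16 kHz) and the threshold value are assumptions that would need tuning on real call audio.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_flags(samples: Sequence[float], frame_len: int = 480,
                 threshold: float = 0.02) -> List[bool]:
    """Mark each frame True when its energy reaches the threshold.

    frame_len=480 is 30 ms at 16 kHz; threshold is a hypothetical value.
    """
    return [
        rms(samples[i:i + frame_len]) >= threshold
        for i in range(0, len(samples), frame_len)
    ]
```

An energy threshold is easily fooled by background noise, which is why production systems usually rely on a trained detector; the sketch is only meant to show where the speech/silence decision sits in the pipeline.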

Module 3: Client Interfaces

Browser extensions (Chrome/Firefox), iOS app, and a simple web dashboard to view captions.

Module 4: Data Management & Security

Stores session transcripts securely with optional end-to-end encryption and on-device deletion policies.
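A deletion policy can be sketched as a small filter over stored transcripts. The example below assumes that transcripts are kept only when the user opted in and only for a fixed retention window. The `Transcript` fields, the `purge` function, and the 30-day default are hypothetical, and encryption is deliberately left out of this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Transcript:
    session_id: str
    text: str
    created_at: datetime
    keep: bool = False  # True only if the user chose to save this transcript


def purge(transcripts: List[Transcript], now: datetime,
          retention: timedelta = timedelta(days=30)) -> List[Transcript]:
    """Return only the transcripts that may still be stored.

    A transcript survives when the user opted in AND it is no older than
    the retention window (30 days here is an assumed default).
    """
    return [
        t for t in transcripts
        if t.keep and now - t.created_at <= retention
    ]
```

Running such a purge on a schedule, and immediately when a session ends without opt-in, keeps stored data to the minimum the user agreed to.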

Module 5: Deployment & Scaling

Docker-based setup for easy deployment on cloud or local servers. Supports GPU acceleration (NVIDIA TensorRT) or CPU-only mode.

Proposed Architecture & Tech Stack

Backend: Python FastAPI with WhisperLive server
Streaming & Processing: Faster-Whisper / TensorRT / OpenVINO backends
Frontend: Browser extensions (JavaScript), iOS app (Swift), and minimal web UI
Deployment: Docker containers for GPU, CPU, or Intel iGPU/dGPU with OpenVINO
Optional: Celery/Redis for managing heavy concurrent sessions

KPIs & Analytics

Caption latency (target <3 s)
Transcription accuracy across Indian languages
Number of concurrent users supported per server instance
Average session time and storage usage
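The KPIs above can be computed from simple logs. The sketch below assumes two hypothetical inputs: a list of per-caption latencies in seconds, and a list of (start, end) timestamps per session. It reports mean and 95th-percentile latency (nearest-rank method) and the peak number of sessions open at once.

```python
import statistics
from typing import Dict, List, Tuple


def latency_summary(latencies_s: List[float]) -> Dict[str, float]:
    """Mean, nearest-rank 95th percentile, and maximum caption latency."""
    if not latencies_s:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_s)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "mean": statistics.fmean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


def peak_concurrency(sessions: List[Tuple[float, float]]) -> int:
    """Largest number of sessions open at the same moment."""
    events = []
    for start, end in sessions:
        events.append((start, 1))   # session opens
        events.append((end, -1))    # session closes
    # At equal timestamps, closes (-1) sort before opens (+1), so a session
    # that starts exactly when another ends is not counted as overlapping.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

Reporting the 95th percentile alongside the mean is useful because a few slow captions can break the under-3-second target even when the average looks fine.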

Milestones & Timeline

Week 1: Finalize SRS, choose backend (Faster-Whisper / TensorRT / OpenVINO), and design architecture
Week 2: Implement audio capture and real-time streaming
Week 3: Build browser extension and iOS client
Week 4: Integrate translation and session transcript storage
Week 5: Optimize latency and test across different devices
Week 6: Prepare final documentation, deployment scripts, and demo presentation

Who It’s For

Students/Capstone Teams: Ideal for accessibility or IoT communication projects
Instructors: Ready-to-run pilot kit with rubrics and evaluation checklists
Organizations: Customer care centers, schools, and offices wanting real-time captioning solutions

Progress Checklist

WhisperLive server deployed (Docker/Native)
Multilingual model loaded and tested
Browser/iOS client connected and streaming audio
Real-time captions displayed and saved
Translation module verified

Resources & Links

Download Project Kit (ZIP)
