Indian Version of Nagish App – Real-time Call Captioning

Category: Software (Accessibility & Communication)
Difficulty: Intermediate
Time to Build: 4–6 weeks
Prerequisites: Python basics, real-time audio streaming, basic Docker/server deployment
Deliverables: SRS, architecture diagram, backend scripts, browser extensions, test report, working prototype


Problem Statement & Expected Outcome

Problem Statement
Phone and VoIP calls remain largely inaccessible for people with hearing impairments. Delays in captions or reliance on external devices make real-time communication difficult.

Expected Outcome
A low-latency web and mobile application that listens to live calls, transcribes speech in near real time, and displays accurate captions in Indian languages, providing equal communication access at home, work, or in education.

Abstract

This project provides a free, open-source alternative to the Nagish app, designed specifically for Indian users. It uses OpenAI’s Whisper model for speech-to-text, supports multilingual transcription and translation, and includes browser and iOS clients. The kit contains ready-to-run Docker images, browser extensions, and Python server scripts, enabling quick deployment in classrooms, workplaces, and call centers.

Details: User Stories & Acceptance Criteria
  • As a hearing-impaired user, I can receive captions in Hindi, English, or my regional language while speaking on a phone or VoIP call.

  • As a call participant, I can see a live caption feed and save transcripts for future reference.

  • Acceptance: Captions appear within 2–3 seconds of speech; accuracy is ≥90% in quiet environments; transcripts persist across sessions if the user opts in.
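
The acceptance criteria above can be turned into an automated check. The sketch below is one possible approach, not part of the project kit: it assumes "accuracy" means 1 minus the word error rate (WER) averaged over test utterances, and that latency is measured per utterance. The names `CaptionSample` and `meets_acceptance` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # Dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word dropped
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


@dataclass
class CaptionSample:
    reference: str    # what was actually said
    hypothesis: str   # what the captioner displayed
    latency_s: float  # seconds from end of speech to caption on screen


def meets_acceptance(samples: Sequence[CaptionSample],
                     max_latency_s: float = 3.0,
                     min_accuracy: float = 0.90) -> bool:
    """True if every caption is fast enough and mean accuracy is high enough."""
    if not samples:
        return False
    worst_latency = max(s.latency_s for s in samples)
    mean_accuracy = sum(
        1.0 - word_error_rate(s.reference, s.hypothesis) for s in samples
    ) / len(samples)
    return worst_latency <= max_latency_s and mean_accuracy >= min_accuracy
```

Splitting on whitespace is a simplification; Indian-language test sets may need script-aware normalization before scoring.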

Scope & Modules

Module 1: Audio Capture and Server

Captures audio from phone, VoIP, RTSP/HLS stream, or browser microphone. Supports Faster-Whisper, TensorRT, and OpenVINO backends for speed and hardware flexibility.
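
Whatever the audio source, the capture step typically ends with fixed-size chunks being sent to the server. The sketch below illustrates that step under stated assumptions: the input is 16-bit little-endian mono PCM at 16 kHz (the sample rate Whisper models operate on), and the 500 ms chunk size is an arbitrary illustrative choice. The function names are hypothetical and not taken from the kit.

```python
import struct
from typing import Iterator, List

SAMPLE_RATE = 16_000   # Hz; Whisper models work on 16 kHz audio
BYTES_PER_SAMPLE = 2   # assumption: 16-bit PCM input


def pcm16_to_float(pcm: bytes) -> List[float]:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    if len(pcm) % BYTES_PER_SAMPLE:
        raise ValueError("PCM buffer length must be a multiple of 2 bytes")
    count = len(pcm) // BYTES_PER_SAMPLE
    samples = struct.unpack(f"<{count}h", pcm)
    return [s / 32768.0 for s in samples]


def iter_chunks(pcm: bytes, chunk_ms: int = 500) -> Iterator[bytes]:
    """Yield consecutive chunks of roughly chunk_ms milliseconds each.

    The final chunk may be shorter than the others.
    """
    chunk_bytes = SAMPLE_RATE * chunk_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
```

Smaller chunks reduce caption latency but increase per-message overhead, so the chunk size is worth tuning during the Week 5 latency work.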

Module 2: Real-Time Transcription Engine

Runs the WhisperLive server with a multilingual model, voice activity detection, and translation threads for Hindi-English or other Indian languages.
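
To illustrate what voice activity detection does in this pipeline, here is a deliberately simple energy-threshold sketch. It is a teaching stand-in, not the detector WhisperLive uses; the frame length (20 ms at 16 kHz) and the threshold value are assumptions that would need tuning on real call audio.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def speech_flags(samples: Sequence[float],
                 frame_len: int = 320,
                 threshold: float = 0.02) -> List[bool]:
    """Mark each frame as speech (True) when its RMS reaches the threshold.

    frame_len=320 corresponds to 20 ms at 16 kHz.
    """
    flags = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        flags.append(rms(frame) >= threshold)
    return flags
```

Frames flagged False can be skipped before transcription, which saves compute during silence. An energy threshold is easily fooled by background noise, so a trained detector is usually preferable in production.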

Module 3: Client Interfaces

Browser extensions (Chrome/Firefox), iOS app, and a simple web dashboard to view captions.

Module 4: Data Management & Security

Stores session transcripts securely with optional end-to-end encryption and on-device deletion policies.
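
A minimal sketch of opt-in transcript storage with an age-based deletion policy, using Python's built-in sqlite3 module. The class name, table layout, and retention approach are assumptions for illustration; encryption is intentionally left out and would need to be layered on top.

```python
import sqlite3
import time
from typing import List, Optional


class TranscriptStore:
    """Stores caption lines per session, only when the user opts in."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS captions ("
            "session_id TEXT NOT NULL, "
            "created_at REAL NOT NULL, "
            "text TEXT NOT NULL)"
        )

    def save(self, session_id: str, text: str, persist: bool,
             now: Optional[float] = None) -> None:
        # Nothing is written unless the user has chosen to keep transcripts.
        if not persist:
            return
        ts = time.time() if now is None else now
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO captions VALUES (?, ?, ?)",
                (session_id, ts, text),
            )

    def load(self, session_id: str) -> List[str]:
        rows = self.db.execute(
            "SELECT text FROM captions WHERE session_id = ? "
            "ORDER BY created_at, rowid",
            (session_id,),
        )
        return [row[0] for row in rows]

    def purge_older_than(self, max_age_s: float,
                         now: Optional[float] = None) -> int:
        """Delete captions older than max_age_s; return how many were removed."""
        cutoff = (time.time() if now is None else now) - max_age_s
        with self.db:
            cursor = self.db.execute(
                "DELETE FROM captions WHERE created_at < ?", (cutoff,)
            )
        return cursor.rowcount
```

Passing a file path instead of ":memory:" makes transcripts survive across sessions, which matches the acceptance criterion on persistence.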

Module 5: Deployment & Scaling

Docker-based setup for easy deployment on cloud or local servers. Supports GPU acceleration (NVIDIA TensorRT) or CPU-only mode.

Proposed Architecture & Tech Stack
  • Backend: Python FastAPI with WhisperLive server

  • Streaming & Processing: Faster-Whisper / TensorRT / OpenVINO backends

  • Frontend: Browser extensions (JavaScript), iOS app (Swift), and minimal web UI

  • Deployment: Docker containers for GPU, CPU, or Intel iGPU/dGPU with OpenVINO

  • Optional: Celery/Redis for managing heavy concurrent sessions
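
Before adding Celery/Redis, a single server process can cap the number of simultaneous captioning sessions on its own. The sketch below shows one way to do that with an asyncio semaphore; it is a lightweight stand-in for a real task queue, and `SessionLimiter` and `fake_session` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple, TypeVar

T = TypeVar("T")


class SessionLimiter:
    """Allows at most max_sessions captioning sessions to run at once."""

    def __init__(self, max_sessions: int) -> None:
        self._slots = asyncio.Semaphore(max_sessions)
        self.active = 0  # sessions running right now
        self.peak = 0    # highest concurrency observed so far

    async def run(self, session: Callable[[], Awaitable[T]]) -> T:
        # Extra sessions wait here until a slot frees up.
        async with self._slots:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await session()
            finally:
                self.active -= 1


async def demo() -> Tuple[SessionLimiter, List[str]]:
    """Run five simulated sessions through a limiter with two slots."""
    limiter = SessionLimiter(max_sessions=2)

    async def fake_session() -> str:
        await asyncio.sleep(0.01)  # stands in for transcription work
        return "done"

    results = await asyncio.gather(
        *(limiter.run(fake_session) for _ in range(5))
    )
    return limiter, list(results)
```

Calling `asyncio.run(demo())` runs the simulation. The `peak` counter gives a direct reading for the "concurrent users per server instance" KPI during load tests.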

KPIs & Analytics
  • Caption latency (target <3 s)

  • Transcription accuracy across Indian languages

  • Number of concurrent users supported per server instance

  • Average session time and storage usage
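
The latency KPI can be summarized from logged measurements. The sketch below assumes each caption's latency has been recorded in seconds; it reports the mean, a nearest-rank 95th percentile, and the share of captions under the 3-second target. The function name and the choice of p95 are illustrative assumptions, not requirements from the kit.

```python
import math
import statistics
from typing import Dict, Sequence


def latency_summary(latencies_s: Sequence[float],
                    target_s: float = 3.0) -> Dict[str, float]:
    """Summarize caption latencies against the target."""
    if not latencies_s:
        raise ValueError("need at least one latency measurement")
    ordered = sorted(latencies_s)
    # Nearest-rank percentile: the smallest value with at least 95% of
    # measurements at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    within = sum(1 for x in ordered if x < target_s) / len(ordered)
    return {
        "mean_s": statistics.fmean(ordered),
        "p95_s": ordered[rank - 1],
        "within_target": within,
    }
```

Reporting a high percentile alongside the mean matters here because occasional long delays are what make a live caption feed hard to follow.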

Milestones & Timeline
  • Week 1: Finalize SRS, choose backend (Faster-Whisper / TensorRT / OpenVINO), and design architecture

  • Week 2: Implement audio capture and real-time streaming

  • Week 3: Build browser extension and iOS client

  • Week 4: Integrate translation and session transcript storage

  • Week 5: Optimize latency and test across different devices

  • Week 6: Prepare final documentation, deployment scripts, and demo presentation

Who It’s For
  • Students/Capstone Teams: Ideal for accessibility or IoT communication projects

  • Instructors: Ready-to-run pilot kit with rubrics and evaluation checklists

  • Organizations: Customer care centers, schools, and offices wanting real-time captioning solutions


Progress Checklist

  • WhisperLive server deployed (Docker/native)
  • Multilingual model loaded and tested
  • Browser/iOS client connected and streaming audio
  • Real-time captions displayed and saved
  • Translation module verified

Resources & Links

Download Project Kit (ZIP)
