LipSeer

Visual speech recognition system that reads lip movements from video using LSTM neural networks and OpenCV, enabling communication without audio.

Python, OpenCV, LSTM, Streamlit, Django

Overview

A visual speech recognition (lip reading) system that enables communication without relying on audio. An OpenCV pipeline processes video in real time, and LSTM neural networks trained on lip movement data recognize speech from the resulting frame sequences, making communication more accessible for hearing-impaired users.
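
A real-time video stream has to be cut into fixed-length sequences before a recurrent model can consume it. The sketch below shows one way this could be done with a sliding window. It is illustrative only and not taken from the LipSeer code: the function name `sliding_windows` and the `window` and `stride` values are assumptions, and the frames can be any objects (for example, cropped lip images).

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")


def sliding_windows(
    frames: Iterable[Frame], window: int = 25, stride: int = 5
) -> Iterator[List[Frame]]:
    """Yield overlapping fixed-length frame sequences from a frame stream.

    `window` and `stride` are hypothetical defaults chosen for illustration.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")

    buffer: Deque[Frame] = deque(maxlen=window)  # oldest frame drops out automatically
    since_last = 0  # frames seen since the last emitted window

    for frame in frames:
        buffer.append(frame)
        since_last += 1
        # Emit once the buffer is full and at least `stride` new frames have arrived.
        if len(buffer) == window and since_last >= stride:
            since_last = 0
            yield list(buffer)


# Usage: integers stand in for video frames.
windows = list(sliding_windows(range(10), window=4, stride=2))
```

With a stride smaller than the window, consecutive sequences overlap, which trades extra inference calls for lower latency between predictions.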

Key Achievements

  • LSTM-based lip reading model
  • Real-time OpenCV video pipeline
  • Streamlit + Django web interface
  • Accessibility-focused application

Tech Stack

  • Python -- primary language for ML and computer vision pipelines
  • OpenCV -- real-time video capture and lip region detection
  • LSTM -- recurrent architecture for sequential lip movement recognition
  • Streamlit -- interactive demo interface for model inference
  • Django -- backend framework for serving predictions and managing data
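
To make the LSTM entry above more concrete, here is a minimal NumPy sketch of an LSTM forward pass that maps a sequence of per-frame lip feature vectors to class probabilities. It is not the project's model: the class name `TinyLSTMClassifier`, the dimensions, and the randomly initialized (untrained) weights are all assumptions for illustration. A real system would more likely use a deep learning framework and trained weights.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTMClassifier:
    """Single-layer LSTM followed by a softmax layer (untrained, illustrative)."""

    def __init__(self, input_dim: int, hidden_dim: int, num_classes: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.hidden_dim = hidden_dim
        # One stacked matrix holds the input, forget, candidate, and output gate weights.
        self.W = rng.normal(0.0, 0.1, size=(4 * hidden_dim, input_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.W_out = rng.normal(0.0, 0.1, size=(num_classes, hidden_dim))
        self.b_out = np.zeros(num_classes)

    def forward(self, sequence: np.ndarray) -> np.ndarray:
        """sequence has shape (T, input_dim); returns probabilities of shape (num_classes,)."""
        h = np.zeros(self.hidden_dim)  # hidden state
        c = np.zeros(self.hidden_dim)  # cell state
        for x_t in sequence:
            z = self.W @ np.concatenate([x_t, h]) + self.b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            g = np.tanh(g)
            c = f * c + i * g      # forget part of the old memory, add new candidate
            h = o * np.tanh(c)     # expose a gated view of the memory
        logits = self.W_out @ h + self.b_out
        exp = np.exp(logits - logits.max())  # numerically stable softmax
        return exp / exp.sum()


# Usage: 25 frames, each described by a hypothetical 32-dimensional lip feature vector.
model = TinyLSTMClassifier(input_dim=32, hidden_dim=16, num_classes=5)
probs = model.forward(np.zeros((25, 32)))
```

The cell state `c` carries information across frames, which is why this architecture suits lip movements: the meaning of a mouth shape depends on the shapes that came before it.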