ASL Recognition - 4-Pipeline ML Architecture
About This Architecture
Four-pipeline American Sign Language recognition system combining data preprocessing, MobileNetV2 transfer learning on Azure ML, real-time MediaPipe hand detection, and Streamlit interface with text-to-speech output. The data pipeline ingests 87K ASL images from Kaggle, applies augmentation and ImageNet normalization, then feeds into a fine-tuned model achieving 98.41% accuracy on 29 sign classes. Real-time inference uses OpenCV webcam capture with MediaPipe keypoint extraction, 10-frame buffering, and phrase assembly via DEL/SPACE commands. The interface layer streams live video, displays recognized letters and words, synthesizes audio via gTTS, and retrieves sign definitions from an English dictionary API. Fork this diagram to customize data sources, swap MobileNetV2 for EfficientNet, integrate alternative TTS engines, or deploy on edge devices.
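To make Pipelines 1 and 2 concrete, here is a minimal sketch of the augmentation, ImageNet normalization, and MobileNetV2 head replacement described above. The framework (PyTorch/torchvision), dataset path, and hyperparameters are illustrative assumptions; the original system may use a different stack, and training actually runs on Azure ML GPU compute rather than locally.

```python
# Sketch of Pipelines 1-2: augmentation, ImageNet normalization, and a
# MobileNetV2 fine-tuned for 29 ASL classes. Framework choice (PyTorch),
# dataset path, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_CLASSES = 29  # A-Z plus SPACE, DEL, NOTHING in the Kaggle ASL alphabet set

# Pipeline 1: augmentation + ImageNet mean/std normalization
train_tf = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("asl_alphabet_train", transform=train_tf)  # hypothetical path
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=64,
                                       shuffle=True, num_workers=4)

# Pipeline 2: transfer learning -- freeze the backbone, swap the classifier head
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False
model.classifier[1] = nn.Linear(model.last_channel, NUM_CLASSES)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
opt = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):  # epoch count is illustrative
    for x, y in train_dl:
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

Freezing the backbone and training only the new 29-way head is the simplest transfer-learning setup; unfreezing the last few feature blocks at a lower learning rate is a common follow-up step when chasing accuracy figures like the 98.41% cited above.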
People also ask
How do you build a real-time American Sign Language recognition system with transfer learning and live video inference?
This diagram shows a four-pipeline approach: Pipeline 1 preprocesses 87K ASL images with augmentation and ImageNet normalization; Pipeline 2 fine-tunes MobileNetV2 on Azure ML GPU to 98.41% accuracy across 29 sign classes; Pipeline 3 captures webcam frames, extracts 21 hand keypoints via MediaPipe, buffers 10 frames, and assembles phrases; Pipeline 4 displays results in Streamlit with live video, recognized letters and words, gTTS audio output, and dictionary definitions for each sign.
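The following sketch shows what the Pipeline 3 loop could look like, assuming OpenCV and MediaPipe as named above. The predict() helper standing in for the trained classifier, the majority-vote threshold, and the window handling are hypothetical; the diagram does not specify how buffered frames are reduced to a single label.

```python
# Sketch of Pipeline 3: webcam capture, MediaPipe hand detection, a
# 10-frame buffer with majority voting, and DEL/SPACE phrase assembly.
from collections import Counter, deque
import cv2
import mediapipe as mp

def predict(landmarks):
    """Hypothetical stand-in: map 21 hand keypoints to a class label."""
    return "A"

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
buffer = deque(maxlen=10)  # 10-frame buffer smooths out jittery per-frame predictions
phrase = []

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV captures BGR
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        buffer.append(predict(results.multi_hand_landmarks[0]))
    if len(buffer) == buffer.maxlen:
        label, count = Counter(buffer).most_common(1)[0]
        if count >= 8:  # accept only a stable majority; threshold is illustrative
            if label == "DEL" and phrase:
                phrase.pop()        # DEL removes the last recognized letter
            elif label == "SPACE":
                phrase.append(" ")  # SPACE closes the current word
            else:
                phrase.append(label)
        buffer.clear()
    cv2.imshow("ASL", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

In Pipeline 4, the assembled phrase would be rendered in Streamlit and passed to gTTS (e.g. `gTTS("".join(phrase)).save("phrase.mp3")`) for audio output, with word lookups against the dictionary API.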
- Domain: ML Pipeline
- Audience: Machine learning engineers building real-time computer vision applications with transfer learning
Generated by Diagrams.so — AI architecture diagram generator with native Draw.io output. Fork this diagram, remix it, or download as .drawio, PNG, or SVG.