PROJECTS

Systems I've Built

Production-grade AI systems and automation infrastructure, from real-time voice agents to self-hosted workflow engines.

AI Voice Agent

Multilingual AI Calling Agent

With Efat Anan Shekh

Production-grade multilingual AI voice agent with native Bangla support. Handles live telephony calls with streaming speech processing, multi-turn dialogue, and real-time language switching.

Problem

Businesses in multilingual markets need voice agents that understand local languages. English-only solutions fail, and building native-language AI requires specialized speech models.

Key Features

  • Native Bangla speech recognition & synthesis
  • Multilingual conversation support
  • Real-time language switching
  • Natural dialogue management

Tech Stack

Python, Asterisk, WebSocket, LLM, Multilingual TTS/STT

Outcomes

  • First production Bangla voice agent
  • Multilingual customer experience
  • 24/7 automated call handling

Workflow Automation

Social Media Automation Platform

With Efat Anan Shekh

Enterprise-grade automation system for social media management, AI-powered customer engagement, lead generation, and multi-channel notification orchestration.

Problem

Manual social media operations create bottlenecks, missed leads, and inconsistent customer engagement. Teams spend hours on repetitive tasks instead of strategy.

Key Features

  • Facebook comments auto-reply with AI
  • Messenger AI assistant integration
  • Lead detection & qualification
  • Telegram/WhatsApp instant notifications

Tech Stack

n8n, Node.js, PostgreSQL, REST APIs, Webhooks, LLM

Outcomes

  • 80% reduction in manual tasks
  • Zero-delay lead response
  • Unified multi-platform automation

Speech AI

BanglaSTT with BanglaSER

Real-time Bangla speech recognition system with speaker emotion recognition, optimized for telephony environments and achieving 90%+ accuracy on live calls.

Problem

Bangla speech recognition lacks telephony-grade accuracy and emotion detection. Existing solutions fail on real-world call audio with noise and compression.

Key Features

  • 90%+ real-time telephony accuracy
  • Bangla speech-to-text
  • Speaker emotion recognition
  • Telephony noise handling

Tech Stack

Python, PyTorch, Whisper, SER Models, WebSocket

Outcomes

  • Production telephony deployment
  • Bangla language AI advancement
  • Real-time emotion insights

AI Audio

Real-Time Text-to-Speech Reader

Low-latency read-aloud experience that converts selected webpage text into natural, expressive speech with seamless playback and synchronized word highlighting.

Problem

Traditional TTS systems have high latency, unnatural voices, and no visual feedback. Users need real-time, engaging audio experiences.

Key Features

  • Real-time streaming playback
  • Natural expressive voice output
  • Live word highlighting sync
  • Seamless continuous playback

Tech Stack

JavaScript, Web Audio API, TTS, DOM Manipulation

Outcomes

  • Instant audio feedback
  • Enhanced reading experience
  • Accessibility improvement

AI Audio Platform

Twilight Story Teller

End-to-end AI storytelling platform that transforms written stories into expressive, high-quality spoken narration with custom voice creation and style control.

Problem

Creating audiobook-quality narration requires expensive voice actors and studios. Content creators need accessible, customizable voice generation.

Key Features

  • Story-to-voice generation
  • Expressive delivery control (calm, dramatic, playful)
  • Custom voice creation pathway
  • Desktop GUI for rapid iteration

Tech Stack

Python, PyTorch, TTS Models, Gradio, CLI Tools

Outcomes

  • Professional narration quality
  • Custom voice branding
  • Scalable content production

Desktop AI

Hybrid Multilingual Transcription Studio

GPU-accelerated desktop application for private, offline transcription of meetings and desktop audio with speaker-aware segmentation and multilingual support.

Problem

Cloud transcription tools create privacy, compliance, and reliability risks. Organizations need local-first alternatives that handle noisy, mixed-language conversations.

Key Features

  • Mic, browser tab & app audio capture
  • Speaker-aware transcript structuring
  • Multilingual & mixed-language support
  • Noise & low-volume speech enhancement

Tech Stack

Python, PyTorch, CUDA, PyQt6, Whisper, NeMo, FFmpeg

Outcomes

  • Zero cloud dependency
  • Reliable meeting transcription
  • Production-ready desktop build

Desktop AI

Offline Speech Processing System

GPU-accelerated desktop application for offline speech recognition with speaker diarization and neural noise suppression.

Problem

Cloud-based speech processing raises privacy concerns and requires constant connectivity. Organizations need offline solutions.

Key Features

  • High-accuracy offline recognition
  • Multi-speaker identification
  • Noise suppression
  • Word-level timestamps

Tech Stack

Python, PyTorch, CUDA, Whisper, PyInstaller

Outcomes

  • Complete data privacy
  • No cloud dependency
  • Production-ready installer

SaaS Platform

RosterBhai - Team Roster Management

With Efat Anan Shekh

Complete roster management system with real-time updates, shift swapping, and team scheduling. Built for efficiency and mobile accessibility.

Problem

Manual scheduling is time-consuming, error-prone, and creates communication gaps. Teams need centralized, real-time roster management.

Key Features

  • Drag-and-drop schedule builder
  • One-click shift swap requests
  • Real-time team notifications
  • Mobile-friendly staff access

Tech Stack

Next.js, TypeScript, PostgreSQL, Real-time Sync

Outcomes

  • 70% faster scheduling
  • Reduced scheduling conflicts
  • Improved team coordination

Healthcare SaaS

DocSaaS - Smart ePrescription

With Efat Anan Shekh

Modern ePrescription platform with built-in medical database, patient analytics dashboard, and practice insights for healthcare providers.

Problem

Paper prescriptions are error-prone and don't provide practice insights. Doctors need digital tools with medical intelligence.

Key Features

  • Smart prescription builder
  • Built-in medicine database
  • Patient analytics dashboard
  • Practice trend insights

Tech Stack

Next.js, TypeScript, PostgreSQL, Analytics

Outcomes

  • Faster prescriptions
  • Data-driven practice insights
  • Improved patient tracking

Ready to Automate?

Let's build intelligent systems that transform your operations. From AI voice agents to workflow automation, I deliver production-ready solutions.

Production-Ready Systems
End-to-End Implementation
Scalable Architecture