Software Engineer with 3+ years of experience building production AI Agent Systems and scalable Distributed Services. Currently working as an AI Software Engineer at the University at Buffalo, focusing on Multi-Agent Orchestration, Semantic Search, and reliable distributed services.
I build end-to-end AI systems: MCP-based Agents with LangGraph/LangChain, RAG Pipelines, and full-stack platforms like AutoRSR and X-Voice that combine low-latency user experiences with production-grade reliability.
My experience spans event-driven microservices, Kafka-based Data Pipelines processing hundreds of thousands of daily events, and cloud-native deployments on AWS and Azure with strong observability and CI/CD.
Core Stack: Python, TypeScript, Java, C#, LangGraph, LangChain, RAG, React, Next.js, Spring Boot, .NET, Kafka, PostgreSQL, Redis, Docker, Kubernetes.
MS in Computer Science from University at Buffalo (3.9 GPA), with certifications across AWS, Azure AI, and GCP. Open to Software Engineer, AI/ML Engineer, and Applied AI Engineer roles.
Innovative solutions and cutting-edge implementations
AI-powered weather assistant built with LangGraph and MCP (Model Context Protocol). Natural language queries for real-time weather, forecasts, air quality, and alerts. Deployed on GCP using Vertex AI with Dockerized microservices and network isolation.
AI-powered clinical question-answering system that synthesizes evidence from ClinicalTrials.gov and PubMed. Delivers cited, confidence-scored answers to medical questions using Google Gemini for intelligent query routing and response generation.
AI-powered campaign analytics tool for adtech teams. Natural language chat interface that queries campaign databases, generates LCI attribution reports, compares campaigns, and recommends audience segments.
Enterprise-grade MCP server that gives AI agents a unified interface to Jira, Slack, and SQL databases. Designed for secure, multi-step agentic workflows across enterprise systems.
Deep learning system for predicting accent, age, and gender from voice data using pi-whisper and LoRA fine-tuning. Real-time speech analysis with high accuracy.
AI-powered system for early detection of speech disorders in children using advanced machine learning. Determines with high accuracy whether a child requires attention from a Speech-Language Pathologist (SLP).
Real-world contributions to production open source projects
Pull Requests
Wired unused branch_timeout_seconds and memory_conflict_strategy configs into the parallel execution path with 6 new tests. Also filed a feature request for GOAL_ACHIEVED event emission and implemented the solution in a separate PR.
Pull Requests
Implemented a full session replay and analytics backend that captures MCP tool call sequences, token usage, latency, and session timelines for production AI agents.
Technical insights and knowledge sharing
Technology stack and expertise
University at Buffalo - SUNY, New York, USA
GPA: 3.9/4.0
Coursework: Distributed Systems, Operating Systems, Deep Learning, AI/ML, Algorithms & Analysis, Cloud Computing
University at Buffalo, United States
The Research Foundation for SUNY, United States
Muni Health, USA
TCS
Netcore Solutions, India
Educational content from the WorkCode & Gaurav YouTube channel
Let's discuss your next project or just say hello!
Buffalo, New York, USA
ksingh.gav@gmail.com
Within 24 hours