Welcome to the Matrix
I'm an AI Software Engineer
Software Engineer with 4+ years shipping secure, high-availability cloud services and AI-driven developer features. Currently an AI Software Engineer at SUNY Buffalo, leading work on Multi-Agent Orchestration, Semantic Search, and production LLM Evaluation.
MCP agents with LangGraph orchestration, RAG pipelines, and LLM eval harnesses that gate production releases.
Event-driven microservices, Kafka pipelines, and Java / .NET platforms on AWS + Azure with on-call DRI rigor.
Contributor to LiteLLM and AWSLabs agent-squad. Peer-reviewed paper in IEEE Xplore on agentic AI.
Innovative solutions and cutting-edge implementations
AI-powered weather assistant built with LangGraph and MCP (Model Context Protocol). Natural language queries for real-time weather, forecasts, air quality, and alerts. Deployed on GCP using Vertex AI with Dockerized microservices and network isolation.
AI-powered clinical question-answering system that synthesizes evidence from ClinicalTrials.gov and PubMed. Delivers cited, confidence-scored answers to medical questions using Google Gemini for intelligent query routing and response generation.
AI-powered campaign analytics tool for adtech teams. Natural language chat interface that queries campaign databases, generates LCI attribution reports, compares campaigns, and recommends audience segments.
Enterprise-grade MCP server that gives AI agents a unified interface to Jira, Slack, and SQL databases. Designed for secure, multi-step agentic workflows across enterprise systems.
Deep learning system for predicting accent, age, and gender from voice data using pi-whisper and LoRA fine-tuning. Real-time speech analysis with high accuracy.
AI-powered system for early detection of speech disorders in children using advanced machine learning. Determines whether a child requires Speech-Language Pathologist (SLP) attention with high accuracy.
Real-world contributions to production open source projects
Pull Requests
Contributed 3 PRs across the supervisor, agents, and chain-execution paths — including a new GeminiAgent that adds Google Gemini model support to AWSLabs' agent-squad framework.
Pull Requests
Shipped PR #21417, a Redis-backed session registry (34 tests) that fixes cross-pod state drift in the MCP gateway of LiteLLM, which routes 100+ LLM APIs. Also contributed 4 more PRs across proxy streaming, error logging, and config loading.
Pull Requests
Wired the unused branch_timeout_seconds and memory_conflict_strategy configs into the parallel execution path with 6 new tests. Also filed a feature request for GOAL_ACHIEVED event emission and implemented the solution in a separate PR.
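For illustration only, here is a minimal sketch of what honoring a per-branch timeout in a parallel execution path can look like. It is not the code from the PR: run_branches, fast, and slow are hypothetical names, asyncio stands in for whatever runtime the project actually uses, and only the branch_timeout_seconds setting comes from the description above.

```python
import asyncio


async def run_branches(branches, branch_timeout_seconds):
    """Run every branch concurrently; a branch that exceeds the timeout yields None.

    `branches` maps a branch name to a zero-argument coroutine function.
    (Hypothetical helper, written for this illustration.)
    """

    async def guarded(name, factory):
        try:
            # wait_for cancels the branch if it runs past the configured timeout.
            result = await asyncio.wait_for(factory(), timeout=branch_timeout_seconds)
            return name, result
        except asyncio.TimeoutError:
            return name, None

    pairs = await asyncio.gather(*(guarded(n, f) for n, f in branches.items()))
    return dict(pairs)


async def fast():
    await asyncio.sleep(0.01)
    return "done"


async def slow():
    await asyncio.sleep(1.0)
    return "never returned"


results = asyncio.run(
    run_branches({"fast": fast, "slow": slow}, branch_timeout_seconds=0.1)
)
print(results)
```

The point of the sketch is that one slow branch no longer stalls the whole fan-out: the timed-out branch is cancelled and reported as None while the others return normally.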
Pull Requests
Implemented full session replay & analytics backend — capturing MCP tool call sequences, token usage, latency, and session timelines for production AI agents.
Posts and articles on AI agents, MCP, LLMOps, and the work behind them.
Technology stack and expertise
Five-year career across AI systems, distributed services, and research
University at Buffalo - SUNY, New York, USA
GPA: 3.9 / 4.0
University at Buffalo, United States
National AI Institute, USA
The Research Foundation for SUNY, United States
Muni Health, USA
TCS
Netcore Solutions, India
Walkthroughs and explainers — Cloud, Algorithms, and DSA fundamentals.
A scripted shell tour — whoami, education, stack, and what's running in 2026.
Talk AI agents, distributed systems, or whatever you're shipping. The form, email, and LinkedIn all reach me.
Buffalo, NY · USA
Open to relocate · remote-friendly
ksingh.gav@gmail.com
Best for detailed conversations
Response time: usually within 24 hours
Mon–Fri · EST