Hi, I'm Saurabh Khanal.

I work in Machine Learning and AI.

I love building things that make people say, "wait, that's actually useful." Whether it's powered by ML or just some good design, I care about making tech feel human.

About Me

I've always been the kind of person who dives head-first into rabbit holes and doesn't surface until something tangible exists. The habit started with dismantling gadgets (often forgetting how to reassemble them) and evolved into late-night coding sessions, model training marathons, and convincing silicon to do things it was never meant to do.

Somewhere along the way I realised it isn't tech itself that excites me; it's using tech to unknot messy, real-world problems. From helping people find their dream car with plain language, to building voice AI that actually listens, to automating the unglamorous backend plumbing that keeps everything alive, I'm happiest when there's a hard problem up front and genuine impact on the other side.

These days my north star is a research companion (think NotebookLM × Perplexity) that lets you debate, dissect, and build upon the papers you read. I've been neck-deep in gems like Scaling LLM Test-Time Compute and Inference Optimization, turning their insights into code. In parallel, I'm brewing an AI-powered car search that feels like chatting with a gear-head friend, plus the infra that keeps everything running fast, smooth, and smart.

When the laptop snaps shut, you'll catch me working on startup ideas, shooting jump shots at the nearest court, planning the next passport stamp, or vibing to a steady stream of classic and new-school R&B. I'm obsessed with cultivating deep relationships and blending my first love, math, with my second love, tech, into things people actually want. Still figuring it out, but building every day. Based in the DC area 📍

🎧 Check out my favorite R&B playlist — smooth neo-soul, '90s & '00s classics.

Experience

My professional journey and the impact I've made along the way

Machine Learning Engineer

Comcast

Washington DC, US

May 2025 - Current

•Fine-tuned and 4-bit-quantized a Mistral-7B LLM for intent detection, improving accuracy by ~3 pp while still meeting the 150 ms response target
•Optimized vLLM KV-cache, batch scheduling, and Databricks autoscaling policies, trimming GPU-hour cost by 10% and holding P95 latency under 500 ms for 20k+ daily voice interactions
•Built a voice AI agent that pairs a LoRA-tuned Mistral-7B with FAISS search over X1 help docs, cutting unhandled voice requests by ~5%

PyTorch

TensorFlow

CUDA

JAX

FastAPI

LLMs

vLLM

Databricks

Backend Software Engineer

SparkSoft

Washington DC, US

July 2024 - May 2025

•Architected and led improvements to NAAS backend APIs, lifting throughput capacity by 20% while maintaining low-latency performance
•Led migration from a legacy chatbot to a generative-AI-powered chatbot agent built on open-source LLMs, boosting user interactions with the service by 20%
•Raised code coverage of Node and Java SpringBoot services by 10% by writing JUnit and Jest tests
•Engineered CI/CD pipelines with Jenkins to automate blue-green deployments of containerized microservices on AWS, enhancing deployment efficiency and system uptime
•Implemented a Jenkins pipeline to automate the decommissioning of inactive environments in our blue-green deployment strategy, resulting in significant cost savings

Java

SpringBoot

Node.js

Jenkins

AWS

Docker

LLMs

Jr. Full Stack Software Engineer

SparkSoft

Washington DC, US

June 2023 - July 2024

•Implemented and unit-tested SpringBoot REST APIs for the Notification-as-a-Service (NAAS) platform, supporting ~10k messages per day and reducing average processing latency by 20%
•Resolved backend service API call failures by analyzing Splunk logs, F5 iRules, and Nginx config files and by running curl commands to verify endpoints, leading to enhanced system scalability and performance
•Spearheaded integration of OAuth authentication into the Angular front-end service for seamless user sign-up and login, improving UX

Java

SpringBoot

Angular

TypeScript

Nginx

Splunk

Machine Learning Research Intern

Georgetown University

Washington DC, US

May 2021 - Aug 2021

•Engineered vegetation greenness prediction models using PyTorch
•Developed RNN, DNN, and XGBoost models, achieving a 90% correlation with historical greenness data
•Improved model precision by over 15% by transitioning from random forest to gradient-boosted models and tuning hyperparameters

Python

PyTorch

RNN

DNN

XGBoost