I'm an AI systems engineer focused on making large language models work reliably in production. That means distributed training infrastructure, inference optimisation, and the retrieval architectures that sit underneath LLM applications. I care about the systems layer — the part between research papers and deployed services that rarely gets written about.

My path through this field has been deliberately cross-cutting. I've worked on multi-node GPU training clusters (FSDP, NCCL, Slurm), built high-throughput backend APIs that serve ML predictions under load, and published research on applied deep learning. The thread connecting all of it is a preference for understanding systems end-to-end rather than specialising in a single layer.

Currently I'm building two open-source tools: Setu, a pre-emptive discipline layer for AI coding agents, and RAG Arena, a competitive evaluation framework for retrieval-augmented generation pipelines. I'm also open to fractional AI systems leadership and consulting engagements for teams scaling their ML infrastructure.

I completed my MS in Computer Science at the University of Florida and my BTech in CSE at BML Munjal University. I also hold NVIDIA certifications in Deep Learning, CUDA, and Generative AI with Diffusion Models.


AI / ML Systems

PyTorch · FSDP · DDP · NCCL · Slurm · FP16/FP8 · Quantisation · CUDA

LLM / Retrieval

RAG · Semantic Scoring · ChromaDB · Pinecone · Weaviate · LLM-as-judge

Backend

Hono · FastAPI · Node.js · TypeScript · REST · Microservices

Data

PostgreSQL · Redis · Polars · Pandas

Cloud / MLOps

Docker · Kubernetes · GitHub Actions · MLflow · AWS · GCP

ML Systems Engineer · SmartData Lab

2024 – 2025

Redesigned the distributed training stack, achieving a 3.2× wall-time reduction on multi-node A100 clusters. Led the FP8 quantisation rollout and inference pipeline optimisation.

Backend / Infrastructure Lead · YVO Service

2023 – 2024

Built and scaled the checkout API from 3k to 12k RPS. Introduced edge caching, async architecture, and ML-backed personalisation that cut cart abandonment by 65%.

Research Engineer · University of Florida, SmartData Lab

2022 – 2023

Published peer-reviewed work on deep learning applications in precision agriculture. A separate paper is under review at ICLR 2026.


MS, Computer Science

University of Florida

2022 – 2024

BTech, Computer Science & Engineering

BML Munjal University

2018 – 2022

NVIDIA Deep Learning · NVIDIA
NVIDIA CUDA Programming · NVIDIA
Generative AI with Diffusion Models · NVIDIA

[Title withheld — under double-blind review]

ICLR 2026

Under review

Deep learning applications in precision agriculture systems

Computers and Electronics in Agriculture · 2024

Published