AI & Data Systems Engineer

Noshitha Juttu

Building applied AI products, LLM workflows, and production-grade data platforms across retrieval, inference optimization, and cloud data infrastructure.

I build at the intersection of AI systems, data infrastructure, and product execution. My background spans production data engineering at Deloitte, applied NLP and on-device model optimization with Adobe x UMass, and agentic AI research at UMass Amherst. I'm focused on turning strong data foundations into reliable AI products, from RAG and multi-agent workflows to scalable pipelines, model evaluation, and inference-ready systems.

Resume

Based in

San Francisco, California

Open to Applied AI, LLM Systems, AI Products, Data Platform, ML/MLOps, and Forward Deployed AI roles.

Core Areas

AI systems · Retrieval pipelines · LLM inference optimization · NLP · Multi-agent systems · Data infrastructure · Data Engineering

About

Production data foundations. Applied AI systems. Product-minded execution.

My work started in data science, where I learned how clean data, simple statistical reasoning, and clear interpretation can influence real decisions. I later moved into data and AI engineering at Deloitte, spending nearly 2.5 years building production pipelines, cloud data platforms, and analytics systems across healthcare, energy, and public utility clients. That experience taught me how to integrate messy systems, optimize workflows, and deliver under real business constraints.

At UMass Amherst, I deepened my focus on AI, NLP, and systems research. I have worked on on-device NLP optimization with Adobe x UMass, multi-agent clinical reasoning at the UMass BioNLP Lab, and rapid AI product prototypes across legal AI, retrieval, graph-based reasoning, and edge/cloud inference. Across these projects, my focus is consistent: build AI systems that are measurable, reliable, and useful beyond the demo.

Experience

3+ Years

Research

2 Publications + 1 Under Review

Background

Deloitte · Adobe Research

Rapid Prototypes

3 Builds · Each < 1 Day

Experience

Research, applied AI, and production data systems.

My work spans research labs, model optimization, and large-scale enterprise data platforms — with a focus on systems that are measurable, deployable, and reliable.

AI Researcher

Current

UMass BioNLP Lab

Advisor: Prof. Hong Yu

Sep 2025 – Jan 2026

Built a Reward-Guided Multi-Agent Clinical NLP System for Alcohol-Use Classification: training-free, inference-time only.

Agentic Clinical NLP System for Alcohol-Use Classification

RLHF / Reward Signals · Experience Memory RAG · Agentic AI · Clinical NLP · AsyncIO · SLURM

What I Built

Built a training-free GRPO-style pipeline that generates multiple LLM reasoning candidates per clinical note, scores them with label-based rewards (Present / Past / None), and improves classification behavior through reflection without parameter updates.
Implemented async LLM rollout pipelines that generate, score, and persist candidate responses per sample, with reward tracking and JSONL experiment logging for reproducibility.
Designed a lightweight experience-memory module using sentence-transformer embeddings to retrieve prior high-reward reasoning traces and surface them in future prompts.
Added controller logic to monitor reward trends across batches, refresh stale experiences, and adjust generation temperature — enabling adaptive inference-time behavior.
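
A minimal sketch of the scoring step described above, not the project's actual code: each candidate gets a label-based reward, rewards are normalized within the group (the GRPO-style part), and the best above-average candidate is kept as an experience. The function names, the 0/1 reward, and the example reasoning strings are assumptions made up for illustration.

```python
from statistics import mean, pstdev
from typing import Optional

LABELS = ("Present", "Past", "None")  # label set named in the text above


def label_reward(predicted: str, gold: str) -> float:
    """Assumed reward: 1.0 when the candidate label matches the reference."""
    return 1.0 if predicted == gold else 0.0


def group_relative_advantages(rewards: list) -> list:
    """Normalize rewards within one group of candidates for the same note."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0.0:  # every candidate scored the same: no preference signal
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


def pick_experience(candidates: list, gold: str) -> Optional[dict]:
    """Return the highest-advantage candidate to store, or None if none stands out."""
    rewards = [label_reward(c["label"], gold) for c in candidates]
    advantages = group_relative_advantages(rewards)
    best = max(range(len(candidates)), key=advantages.__getitem__)
    if advantages[best] <= 0.0:
        return None
    return {**candidates[best], "advantage": advantages[best]}


# Made-up candidates for a single note whose reference label is "Past".
candidates = [
    {"label": "Past", "reasoning": "Note says the patient quit drinking."},
    {"label": "Present", "reasoning": "Alcohol is mentioned, so assume current use."},
    {"label": "None", "reasoning": "No quantity is reported."},
    {"label": "Past", "reasoning": "History of use, now abstinent."},
]
experience = pick_experience(candidates, gold="Past")
```

No model parameters change in this loop; only the stored experiences, and therefore the future prompts, do.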

Applied AI Extern

Adobe x UMass

Advisors: Prof. Andrew McCallum & Franck Dernoncourt

Jan 2025 – May 2025

Engineered a full on-device NLP inference pipeline — PyTorch → ONNX → CoreML → quantized deployment with BLEU-based quality tracking.

On-Device NLP Optimization for Neural Machine Translation

ONNX Runtime · CoreML · PTQ / QAT · Edge Inference · MarianMT · BLEU Evaluation

What I Built

Implemented encoder-decoder ONNX export and monolithic ONNX workflows for MarianMT, handling sequence-to-sequence dynamic axes, greedy decoding paths, and ONNX Runtime inference validation.
Explored CoreML conversion paths for Apple/mobile deployment alongside vocabulary-reduced ONNX variants, reducing model footprint while maintaining runtime compatibility.
Applied post-training quantization (PTQ) and quantization-aware training (QAT) to compressed ONNX models, evaluating size-quality tradeoffs across baseline, exported, and quantized variants.
Built BLEU-based evaluation pipelines to systematically benchmark translation quality across PyTorch, ONNX, vocabulary-reduced, and quantized model stages — informing deployment tradeoff decisions.
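
To illustrate the size-quality tradeoff behind post-training quantization, here is a small sketch of symmetric per-tensor INT8 quantization on a plain list of weights. It is a hand-rolled illustration under simplifying assumptions, not the ONNX Runtime tooling used in the project, and the weight values are made up.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


def max_abs_error(original, restored):
    """Largest per-weight reconstruction error introduced by quantization."""
    return max(abs(a - b) for a, b in zip(original, restored))


weights = [0.82, -1.27, 0.003, 0.45, -0.6]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
error = max_abs_error(weights, restored)  # bounded by about scale / 2
```

Each weight drops from 32 bits to 8, at the cost of a rounding error of at most half the scale. That weight-level error is what the BLEU comparisons measure at the translation level.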

Consulting Client Work

Deloitte USI — AI & Data Engineering Analyst

Delivered production data engineering, analytics, and ML work across multiple client domains under one Deloitte role.

Period

Sep 2021 – Jan 2024

Client

Power & Utilities — Public Water Utility

Nov 2022 – Jan 2024

Customer Utility Consumption Web Application

Architected and maintained secure ingestion and transformation pipelines for utility billing, consumption, and daily customer activity data powering customer-facing digital experiences at scale.

Delivered 20+ end-to-end data pipelines within a 2-month window, earning a Client Favourite Award for rapid ownership, stakeholder alignment, and on-time delivery.
Supported data powering customer-facing utility experiences for a base of 15M+ users, including leadership-facing dashboards and operational reporting.

Client

Energy, Resources & Industries — Fortune 100 Energy Utility

Apr 2022 – Nov 2022

Data Platform Migration

Led enterprise-scale migration of legacy Informatica BDM workflows to a modern Databricks and PySpark stack, delivering measurable performance gains and earning recognition for client impact.

Reduced batch pipeline runtimes by 25–30% through Databricks migration, incremental processing design, and optimized transformation logic, directly improving downstream SLA compliance for the client.
Received a Spot Award for ownership, delivery pace, and client-facing impact during a high-stakes migration program.

Client

Life Sciences & Healthcare

Nov 2021 – Apr 2022

Healthcare ERP Source Automation to Cloud

Automated ELT pipelines across multi-source healthcare ERP systems to centralize patient and operational data in the cloud, enabling downstream clinical analytics and risk identification.

Built a 3-layer medallion architecture (raw → transformed → reporting) with schema validation, deduplication, and standardized transformations, reducing ingestion errors and improving client trust in reporting.
Supported clinical analytics workflows focused on surfacing high-priority and critical-care patient signals from consolidated records across source systems.
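
The three layers can be sketched in plain Python as follows. This is an illustration of the pattern only: the field names, the schema, the "latest record wins" deduplication rule, and the sample rows are all hypothetical, and the real pipelines ran on a cloud data platform rather than in-memory lists.

```python
# Hypothetical schema: field name -> required type.
REQUIRED_FIELDS = {"patient_id": str, "source": str, "updated_at": str}


def validate(record):
    """Schema check: every required field is present with the expected type."""
    return all(isinstance(record.get(k), t) for k, t in REQUIRED_FIELDS.items())


def to_transformed(raw_records):
    """Raw -> transformed: reject invalid rows, standardize IDs, keep latest per key."""
    latest, rejected = {}, []
    for rec in raw_records:
        if not validate(rec):
            rejected.append(rec)
            continue
        clean = {**rec, "patient_id": rec["patient_id"].strip().upper()}
        key = (clean["patient_id"], clean["source"])
        # ISO-8601 date strings compare correctly as plain strings.
        if key not in latest or clean["updated_at"] > latest[key]["updated_at"]:
            latest[key] = clean
    return list(latest.values()), rejected


def to_reporting(transformed):
    """Transformed -> reporting: record counts per source system."""
    counts = {}
    for rec in transformed:
        counts[rec["source"]] = counts.get(rec["source"], 0) + 1
    return counts


raw = [  # made-up rows
    {"patient_id": " p001 ", "source": "erp_a", "updated_at": "2022-01-05"},
    {"patient_id": "P001", "source": "erp_a", "updated_at": "2022-01-07"},
    {"patient_id": "P002", "source": "erp_b", "updated_at": "2022-01-06"},
    {"patient_id": None, "source": "erp_b", "updated_at": "2022-01-06"},
]
transformed, rejected = to_transformed(raw)
report = to_reporting(transformed)
```

Keeping rejected rows instead of silently dropping them makes ingestion errors visible and auditable.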

Client

Automotive — Used Car Dealers

Sep 2021 – Nov 2021

Used Car Market Pricing Intelligence

Analyzed used-car market data to identify price-driving factors across vehicle attributes, market trends, and resale patterns, delivering pricing intelligence to client stakeholders.

Built predictive modelling workflows to estimate used-car pricing, surfacing the most important feature drivers to support client-facing pricing recommendations and market-entry decisions.
Improved model interpretability to give client teams actionable signals — translating ML outputs into a decision-support tool for competitive pricing strategy.

Data Scientist Intern

Innodatatics

Jun 2019 – Aug 2019

Worked on airline churn analysis using statistical testing and classical machine learning to identify customer retention drivers.

Airline Customer Churn Analytics

What I Built

Performed EDA and feature engineering on 100K+ airline customer records.
Built and validated a decision-tree churn model with 93.5% accuracy.
Used ANOVA and t-tests to identify significant drivers and translate them into retention recommendations.

Projects

Selected systems and applied research work.

A mix of retrieval, inference, embedded ML, and analytical systems built across coursework, research, and applied engineering work.

Model Card · Legal NLP · LoRA Fine-Tune

SaulLM-7B-AnomalyDetector

LoRA adapter on SaulLM-7B for unfair clause detection in Terms of Service — 4-bit QLoRA, ~0.18% of parameters trained.

LoRA · SaulLM-7B · 4-bit QLoRA · Legal NLP · PEFT · Hugging Face

View project

Model Card · Legal NLP · LoRA Fine-Tune

TinyLlama-ToS-Finetuned

LoRA-finetuned TinyLlama-1.1B for Fair/Unfair clause classification in ToS agreements — only ~0.1% of parameters trained. Tied to arXiv:2510.22531.

LoRA · TinyLlama-1.1B · PEFT · Legal NLP · Classification · Hugging Face

View project

Retrieval · NLP Systems

RAG-based Research Copilot

Built modular retrieval and indexing pipelines using LangGraph, Hugging Face, and semantic search to automate literature ingestion, search, and topic discovery.

LangGraph · RAG · Sentence-Transformers · Semantic Search

View project

Data Systems · Analytical Ranking

Automated SQL View Generation & Entropy-Based Ranking Engine

Engineered KL-divergence-based ranking, in-memory caching, and pruning to prioritize analytical views, improve throughput, and reduce query runtime from 10s to under 2s.

SQL · KL Divergence · SQLite · Optimization

View project
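
A rough sketch of the KL-divergence ranking idea from the SQL view project above: each candidate view compares a target distribution against a reference distribution, and views whose target deviates most are ranked first. The view names, the counts, and the epsilon smoothing are assumptions for illustration, not the project's actual scoring code.

```python
import math


def normalize(counts):
    """Turn raw aggregate counts into a probability distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) over the keys of p, with eps smoothing to avoid log(0)."""
    return sum(
        pk * math.log((pk + eps) / (q.get(k, 0.0) + eps))
        for k, pk in p.items()
        if pk > 0.0
    )


def rank_views(views, top_k=2):
    """Score each view by KL(target || reference) and keep the top_k."""
    scored = [
        (name, kl_divergence(normalize(target), normalize(reference)))
        for name, (target, reference) in views.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Made-up aggregates: (target subset counts, reference counts) per view.
views = {
    "by_region": ({"A": 10, "B": 90}, {"A": 50, "B": 50}),
    "by_weekday": ({"Mon": 50, "Tue": 50}, {"Mon": 50, "Tue": 50}),
    "by_channel": ({"web": 60, "store": 40}, {"web": 50, "store": 50}),
}
ranking = rank_views(views, top_k=3)
```

A view whose target matches its reference scores zero and falls to the bottom, which is what makes low-scoring views candidates for pruning.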

Embedded ML · Real-time Systems

Hand Gesture Controlled UAV / IMU-Based Gesture Recognition

Built a gesture recognition system using ESP32-S3 and IMU sensor data with FFT-based preprocessing for motion-driven control and low-latency command execution.

ESP32-S3 · IMU · FFT · Embedded ML

View project
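
As an illustration of frequency-domain preprocessing for the IMU gesture project above, the sketch below extracts the dominant frequency from one window of samples. It uses a direct O(N^2) DFT in place of an FFT (same spectrum, slower) so it stays dependency-free. The sample rate, window length, and synthetic sine input are assumptions, and the on-device firmware would not be written in Python.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum for bins 0..N/2 using a direct DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(acc))
    return mags


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC bin in the window."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
    return peak_bin * sample_rate_hz / len(samples)


# Synthetic one-second window: a 5 Hz motion sampled at 50 Hz.
sample_rate_hz = 50
window = [math.sin(2 * math.pi * 5 * t / sample_rate_hz) for t in range(50)]
freq = dominant_frequency(window, sample_rate_hz)
```

Features like this one (dominant frequency, band energies) are compact enough to feed a small classifier on a microcontroller.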

Hackathons & Rapid Prototypes

Fast builds that explore ideas quickly.

A space for hackathon systems, rapid prototypes, and experimental builds designed and shipped under tight time constraints.

Stanford CodeX Law Hackathon

Flagship Build · Legal AI

< 18 hours

BriefCheck

Theme

Legal AI Verification & Trust Layer

Verification layer for AI-drafted legal briefs — checks if cited cases are real, still good law, on-point, and jurisdiction-matched. Shipped end-to-end in under 18 hours.

Legal AI · Verification · Retrieval · LLM Orchestration · MCP

View build

Hack With Bay2.0

Prototype · Graph + Agents

< 8 hours

Graph-Native CKD Clinical Reasoning Pipeline

Theme

Clinical Decision Support with Graph + Agents

Graph-based clinical decision support prototype built with Neo4j and RocketRide that structures patient records, KDIGO guideline rules, contraindications, and treatment thresholds into a queryable knowledge graph for agent-driven reasoning.

Neo4j · Agents · Clinical AI · Prototype

View build

Google DeepMind × Cactus AI Hackathon

Hackathon · Inference Routing

Hackathon Build

Hybrid Edge-Cloud Routing for Tool-Calling AI

Theme

Hybrid inference, tool routing, and edge AI systems

Hybrid routing system that decides when FunctionGemma-270M is sufficient for tool-calling and when to escalate to Gemini, trading off speed, accuracy, and on-device execution.

Edge AI · Gemini · Tool Routing · Systems

GitHub Repo
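
One way such an edge-versus-cloud decision could look, as a hedged sketch: accept the on-device tool call only when it names a known tool with enough confidence, otherwise escalate. The `LocalResult` type, the confidence score, the 0.8 threshold, and the tool names are all hypothetical. The hackathon build's actual routing criteria are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalResult:
    """Hypothetical output of the small on-device model for one request."""
    tool_name: Optional[str]  # None when no tool call was produced
    confidence: float         # assumed score in [0, 1]


def route(local: LocalResult, known_tools: set, threshold: float = 0.8) -> str:
    """Return 'edge' to accept the on-device call or 'cloud' to escalate."""
    if local.tool_name is None:
        return "cloud"  # the small model produced no usable call
    if local.tool_name not in known_tools:
        return "cloud"  # hallucinated or unsupported tool
    if local.confidence < threshold:
        return "cloud"  # not confident enough to skip the larger model
    return "edge"


TOOLS = {"get_weather", "set_alarm"}  # made-up tool registry
decision = route(LocalResult("get_weather", 0.93), TOOLS)
```

Raising the threshold shifts traffic toward the cloud model (higher accuracy, higher latency); lowering it keeps more calls on-device.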

Publications

Papers, preprints, and research outputs.

Research work spanning multi-agent language models, legal NLP, and early deep learning systems.

When Consensus Becomes Compliance: Measuring Sycophancy in Multi-Agent Language Model Interactions

2026

ACL 2026 Student Research Workshop · Under Review

Introduced the Conditional Infection metric to quantify interaction-driven epistemic regression in multi-agent LLM debates.

Text to Trust: Evaluating Fine-Tuning and LoRA Trade-offs in Language Models for Unfair Terms of Service Detection

2025

arXiv preprint (arXiv:2510.22531)

Systematic evaluation of full fine-tuning and parameter-efficient LoRA adaptations for clause-level classification and risk flagging in legal contracts.

Development of an AI-Based Chatbot Using Deep Neural Networks

2021

International Conference on Intelligent Vision and Computing 2021

Speech-enabled chatbot development using Bag of Words, DNNs, and batch gradient descent; recognized for societal impact and integrated into a city municipal website.

Tech Stack

Tools I use to build and evaluate systems.

From model optimization and retrieval to orchestration, warehousing, and infrastructure.

inference

vLLM · CUDA · ONNX Runtime · CoreML · PTQ/QAT · INT8/FP16

agents

LangChain · LangGraph · ReAct · Multi-agent orchestration · GRPO/DPO

retrieval

RAG · LlamaParse · LlamaIndex · FAISS · Milvus · Sentence-Transformers · Semantic retrieval · Neo4j

data

Spark · Databricks · Airflow · Redshift · Athena · Snowflake · BigQuery · dbt

infrastructure

AWS · SageMaker · Docker · Kubernetes · Terraform

tooling

MCP · DBeaver · GitHub · Python · TypeScript

Contact

Let’s build AI systems that hold up beyond the demo.

I’m open to applied AI, AI systems, retrieval, ML infrastructure, and data platform roles — especially work that sits between research ideas and production systems.

Resume · LinkedIn · GitHub · Email · Hugging Face · Google Scholar