Building distributed backend systems and full-stack microservices with Java, Spring Boot, Node.js & React. Prototyping end-to-end ML systems — speech emotion detection, biomarker imaging, deep learning pipelines.
I'm Nischith Adavala, a software engineer and MS Computer Science candidate at the University at Buffalo (GPA: 3.8/4.0), graduating December 2025.
My engineering foundation is in distributed backend systems — at Brane Enterprises I engineered a contract-evaluation engine in Java/Spring Boot achieving sub-20ms latency through AST-based rule parsing and async caching pipelines. At RADcube I built real-time WebSocket microservices across React and Angular for multi-tenant clinical systems.
Beyond the backend, I've developed end-to-end ML pipelines: a speech emotion detection system with CNN+LSTM, a 90%-accurate mitosis detection pipeline on gigapixel pathology scans, and an auto-annotation platform that cut dataset creation time by 90%+. I also post daily dev content on Daily.dev and Medium.
Full-stack RAG chatbot that ingests PDF documents, stores vector embeddings in Supabase, and answers user queries using Groq-powered Llama models (70B & 8B). Built with a LangGraph state-machine architecture — separate Ingestion and Retrieval graphs handle document parsing and question-answering with real-time SSE streaming. Features multi-chat sessions, model switching, pinned chats, ⌘K search, and chat export. Deployed live on Railway with a CI/CD GitHub Actions pipeline.
Scalable object detection workflow using SSD MobileNet + TensorFlow to auto-annotate image datasets and generate COCO-format bounding box metadata. Reduced manual annotation time by 90%+. Supports batch processing, configurable confidence thresholds, and multi-class filtering — cutting ML iteration cycles dramatically.
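As a rough illustration of the annotation step described above, the sketch below turns detector output into COCO-style annotation records, applying a confidence threshold and a class filter. It assumes the detector returns normalized `[ymin, xmin, ymax, xmax]` boxes; the function name, the detection dict layout, and the parameter names are hypothetical and not taken from the project.

```python
def to_coco_annotations(detections, image_id, width, height,
                        min_score=0.5, keep_classes=None, start_id=1):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to COCO-style records.

    `detections` is a hypothetical list of dicts with keys
    "box", "score", and "class_id".
    """
    annotations = []
    next_id = start_id
    for det in detections:
        # Confidence threshold: drop low-scoring detections.
        if det["score"] < min_score:
            continue
        # Multi-class filter: keep only the requested categories, if any.
        if keep_classes is not None and det["class_id"] not in keep_classes:
            continue
        ymin, xmin, ymax, xmax = det["box"]
        # COCO stores boxes as [x, y, width, height] in absolute pixels.
        x = xmin * width
        y = ymin * height
        w = (xmax - xmin) * width
        h = (ymax - ymin) * height
        annotations.append({
            "id": next_id,
            "image_id": image_id,
            "category_id": det["class_id"],
            "bbox": [round(x, 2), round(y, 2), round(w, 2), round(h, 2)],
            "area": round(w * h, 2),
            "iscrowd": 0,
            # Extra field, not part of COCO ground-truth annotations;
            # kept here so a reviewer can sort by confidence.
            "score": det["score"],
        })
        next_id += 1
    return annotations


if __name__ == "__main__":
    sample = [
        {"box": [0.25, 0.5, 0.75, 1.0], "score": 0.9, "class_id": 1},
        {"box": [0.0, 0.0, 1.0, 1.0], "score": 0.3, "class_id": 1},
    ]
    print(to_coco_annotations(sample, image_id=7, width=100, height=200))
```

Batch processing would amount to calling this once per image and continuing `start_id` from the last annotation id.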
XGBoost + Random Forest ensemble predicting T20 cricket innings scores from live match state — team, venue, overs, wickets, current run rate, and last-5-over momentum. Trained on historical IPL + T20I data. Deployed as an interactive Streamlit dashboard with live confidence intervals per over.
End-to-end distributed ML system combining MFCC, Chroma and Mel-Spectrogram feature extraction with CNN + LSTM architectures for temporal modeling of speech signals. Scalable preprocessing pipeline merging 4 benchmark datasets across accents and noise environments. Exposed via AWS Lambda inference API with a live React interface for real-time audio classification at sub-100ms latency.
Unix-like OS kernel in C on x86: implemented process execution (fork/exec/wait), argument parsing, stack setup, parent–child synchronization via semaphores, and a per-process file descriptor table (0–127). Secure system calls with kernel-level pointer validation. Refactored 5K+ lines and reached a 90%+ test pass rate, debugging with GDB and building with Makefiles.
Research-grade DL pipeline achieving 90% detection accuracy on gigapixel pathology TIFFs using multi-scale tiling strategies. Built an interactive high-resolution OpenSeadragon tile viewer with model prediction overlays — enabling fast visualization of mitotic hotspots with minimal memory footprint for clinical pathologist workflows.
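One way to picture the multi-scale tiling mentioned above is as a plan of overlapping tile coordinates at several downsample levels, so that only one tile needs to be in memory at a time. The sketch below is a minimal version under assumed values: the tile size, overlap, and downsample factors are illustrative and are not the project's actual settings.

```python
def tile_grid(width, height, tile=512, overlap=64):
    """Yield (x, y, w, h) tiles that cover a width x height image.

    Neighbouring tiles share `overlap` pixels so that objects on a tile
    border appear whole in at least one tile.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    stride = tile - overlap
    for y in range(0, max(height - overlap, 1), stride):
        for x in range(0, max(width - overlap, 1), stride):
            # Edge tiles are clipped to the image bounds.
            yield x, y, min(tile, width - x), min(tile, height - y)


def multi_scale_tiles(width, height, downsamples=(1, 4, 16), tile=512, overlap=64):
    """Return {downsample: [tiles]} with coordinates in each level's pixel space."""
    plan = {}
    for d in downsamples:
        # Ceiling division gives the level size at this downsample factor.
        level_w, level_h = -(-width // d), -(-height // d)
        plan[d] = list(tile_grid(level_w, level_h, tile, overlap))
    return plan


if __name__ == "__main__":
    plan = multi_scale_tiles(100_000, 80_000)
    for d, tiles in plan.items():
        print(f"downsample {d}: {len(tiles)} tiles")
```

Reading the pixels for each tile would be done by a whole-slide or TIFF reader, which is left out here.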
Real-time football player and ball detection using YOLOv8 on live match footage. Tracks players, referees, and ball with team classification via K-Means color clustering on jersey pixels. Estimates player speed and distance covered using optical flow and perspective transformation into real-world coordinates.
Real-time body language classification from webcam using MediaPipe Holistic, extracting 543 face, pose, and hand landmarks per frame. Exported landmark coordinates as structured feature vectors and trained scikit-learn classifiers (Random Forest, SVM) for gesture and emotion posture recognition at 30 FPS.
Full-stack handwritten digit recognizer with a live HTML5 canvas draw interface. CNN trained on MNIST (98.7% accuracy) served via Flask REST API — returns top-5 class probabilities rendered as animated confidence bars. Preprocessing pipeline handles canvas-to-28×28 normalization in real time.
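The canvas-to-28×28 step could look roughly like the block-averaging sketch below. It assumes a square grayscale canvas whose side is a multiple of 28, with bright strokes on a dark background as in MNIST; a canvas with dark strokes on white would need inverting first. The function name and input layout are hypothetical.

```python
def canvas_to_mnist(pixels, out_size=28):
    """Downsample a square grayscale canvas (values 0-255) by block averaging.

    Returns an out_size x out_size grid of floats scaled to [0, 1].
    """
    side = len(pixels)
    if side == 0 or side % out_size != 0 or any(len(row) != side for row in pixels):
        raise ValueError("canvas must be square with a side that is a multiple of out_size")
    block = side // out_size
    scale = block * block * 255  # normalizes each block's sum to [0, 1]
    grid = []
    for by in range(out_size):
        row = []
        for bx in range(out_size):
            total = 0
            # Sum every canvas pixel that falls inside this output cell.
            for y in range(by * block, (by + 1) * block):
                for x in range(bx * block, (bx + 1) * block):
                    total += pixels[y][x]
            row.append(total / scale)
        grid.append(row)
    return grid


if __name__ == "__main__":
    canvas = [[0] * 280 for _ in range(280)]
    for y in range(100, 180):          # draw a crude vertical stroke
        for x in range(135, 145):
            canvas[y][x] = 255
    small = canvas_to_mnist(canvas)
    print(len(small), len(small[0]), max(max(r) for r in small))
```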
Regression pipeline predicting calories burned from biometric and workout features — age, BMI, exercise duration, heart rate, body temperature. XGBoost achieves low MAE (<5 kcal). Includes Seaborn correlation heatmaps, feature importance ranking, and an interactive prediction widget.
Hybrid recommendation engine combining content-based filtering on Spotify audio features (danceability, energy, valence, tempo, key) with collaborative SVD matrix factorization. Cosine similarity and KNN for item-to-item lookup. Generates personalized playlists from listening history with cold-start handling.
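The content-based half of this approach can be sketched as a cosine-similarity nearest-neighbour lookup over audio feature vectors. The example below is a minimal illustration with made-up track ids and feature values; it assumes features are already scaled to comparable ranges (tempo and key would need normalizing first) and does not cover the SVD collaborative part.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_tracks(seed_id, features, k=2):
    """Return the k tracks most similar to seed_id as (track_id, score) pairs."""
    seed = features[seed_id]
    scored = [
        (cosine_similarity(seed, vec), track_id)
        for track_id, vec in features.items()
        if track_id != seed_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(track_id, score) for score, track_id in scored[:k]]


if __name__ == "__main__":
    # Hypothetical tracks: [danceability, energy, valence], each in [0, 1].
    catalog = {
        "track_a": [0.80, 0.90, 0.70],
        "track_b": [0.75, 0.85, 0.65],
        "track_c": [0.10, 0.20, 0.90],
    }
    print(nearest_tracks("track_a", catalog, k=2))
```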
CNN image classifier for potato leaf disease detection (Early Blight, Late Blight, Healthy) trained on the PlantVillage dataset — achieving 97%+ accuracy. Full-stack deployment: React drag-and-drop frontend uploads leaf photos, FastAPI backend returns diagnosis with confidence scores and treatment recommendations.
Decentralized Electronic Health Records platform on Ethereum smart contracts — patient-controlled access permissions, immutable on-chain audit trails, and AES-encrypted medical records stored on IPFS. Ensures HIPAA-compliant data integrity across mobile cloud e-health systems with role-based key management.
Solidity DeFi banking contract on Ethereum with deposit, withdraw, transfer, and interest accrual. Reentrancy guard, access control via OpenZeppelin, and on-chain event emission for audits. Deployed on Goerli testnet with ethers.js + React frontend for MetaMask wallet interaction and transaction history.
End-to-end deep learning system for classifying skin lesions into 9 diagnostic categories using CNNs and Transfer Learning (VGG16). Trained on the ISIC Kaggle dataset (~2,239 images, balanced to ~5,850 via Augmentor), with an 80/10/10 train-val-test split. Multi-GPU training via TensorFlow's MirroredStrategy. Deployed as a Streamlit web app where users upload dermoscopic images and receive real-time class predictions with confidence scores.
End-to-end audioless speech recognition system that reads lip movements from video to transcribe spoken words. A spatiotemporal CNN extracts visual features from the frame sequence while a Bi-LSTM captures temporal dependencies across it. CTC (Connectionist Temporal Classification) loss enables alignment-free training on variable-length utterances, bridging computer vision and NLP without any audio input.
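To show why CTC needs no frame-level alignment, here is a minimal sketch of greedy CTC decoding: take the most likely symbol per frame, merge consecutive repeats, then drop blanks. This illustrates the decoding side only, not the CTC loss itself, and the alphabet and per-frame probabilities are invented for the example.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Collapse a per-frame best path into text using the CTC rules.

    `frame_probs` is a list of per-frame probability lists; index `blank`
    in `alphabet` is the CTC blank symbol.
    """
    # Step 1: pick the most likely symbol index for each frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded = []
    previous = None
    for idx in best_path:
        # Step 2: merge consecutive repeats. Step 3: skip blanks.
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)


if __name__ == "__main__":
    alphabet = ["-", "h", "i"]  # index 0 is the blank
    frames = [
        [0.1, 0.8, 0.1],  # h
        [0.1, 0.7, 0.2],  # h (repeat, merged)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.1, 0.8],  # i
        [0.2, 0.1, 0.7],  # i (repeat, merged)
    ]
    print(ctc_greedy_decode(frames, alphabet))
```

A blank between two identical symbols keeps them separate, which is how doubled letters survive the merge step.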
Automated pet emotion recognition system that classifies facial expressions of cats and dogs into Angry, Happy, Sad, and Other using an EfficientNet CNN backbone. Trained on the Kaggle Pets Facial Expression Dataset (~250 images/class) with an 80/10/10 split and ImageDataGenerator augmentation. Deployed via Streamlit — upload a pet photo and get real-time emotion prediction with confidence score.
End-to-end data analytics project analyzing the All India Consumer Price Index (Urban & Rural, base year 2010) to uncover inflation trends, regional variations, and consumer purchasing power shifts. Data loaded into IBM DB2 and processed in IBM Cognos Analytics — building interactive dashboards, multi-scene data stories, and crosstab reports. Embedded into a Flask web application via Cognos iframe integration for public access.
Posting daily developer content — tech insights, engineering deep dives, and curated reads on distributed systems, ML, and modern backend architectures. Part of a community of 1M+ developers.
Writing in-depth technical articles on distributed systems, backend engineering patterns, machine learning pipelines, and cloud architecture — bridging theory with real production experience.
Breaking down what the "Generative AI Engineer" title really requires — beyond just prompting. The real skills, architecture decisions, and production challenges.
Why inference latency is the most underrated problem in AI deployment — and the real techniques engineers use to fight it in production.
An honest walkthrough of what a production AI pipeline actually involves — from raw data ingestion to model serving and API design.
The case for understanding distributed systems fundamentals before chasing frameworks — and why it makes you a better engineer long-term.
On imposter syndrome, the myth of "readiness," and why feeling uncertain as a new grad is actually a sign you're paying attention.
A ground-up implementation of convolutional neural network operations — forward pass, backprop, padding, and pooling — coded from scratch in NumPy.
Scored 157.22 out of 175 in the Job-A-Thon 13 Hiring Contest by GeeksforGeeks — demonstrating strong competitive programming and algorithmic problem-solving against thousands of candidates nationally in a timed, high-stakes environment.
Software engineer with a strong background in distributed systems, full-stack development and ML engineering. Let's build something great together.