Hey, I'm Ritesh, currently working as a Research Scientist at d_model. I am broadly interested in the mechanistic interpretability of large language models. While I am pretty flexible, my current focus lies in foundational interpretability research, reasoning models, and "model psychology": understanding and controlling model behavior. More generally, I aim to develop principled methods that make advanced AI systems more interpretable, reliable, and aligned with human intent.
Apart from this, I enjoy picking up math and physics concepts that catch my eye. I am always looking for opportunities to collaborate with fellow innovators :)
CGPA: 8.47
Percentage: 94.67%
Percentage: 93.00%
Bronze Medal (Solo Participant)
Achieved a Bronze Medal in the Kaggle Featured Competition on detecting personally identifiable information (PII) in student essays.
certificate link
12th Global Rank.
Our solution for biomass supply chain optimization for the state of Gujarat, using ML and MILP, was ranked 12th on the global leaderboard.
certificate link
Regional Finalists.
EYIC is a national-level competition. My team built an app called "Enabled", a community platform for persons with disabilities. Pitch link
certificate link
Second place.
My team's solution, "Rhythm", a web- and app-based integrated platform for preventive cardiovascular self-care, won 2nd prize at this hackathon.
Certificate link
Solution link
Developed an automated pipeline to generate environments eliciting evaluation awareness in LLMs. Benchmarked multiple black-box and white-box suppression techniques.
Project link
Slides
Investigated whether LLMs form internal "trustworthiness" attributes of users and whether these can bypass safety guardrails. Trained linear probes on Llama models to extract trust vectors from synthetic multi-turn conversations. Demonstrated that trust vectors are mechanistically distinct from compliance/refusal directions and successfully induce jailbreaking through a novel mechanism: making the model perceive users as trusted individuals rather than directly suppressing refusal.
Project link
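As a rough illustration of the probing setup described above, here is a minimal sketch (with synthetic stand-in activations, not real Llama hidden states; `d_model`, the shift magnitude, and the data generator are all placeholders I made up) of training a linear probe and reading off a "trust vector" from its weights:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_model = 64  # hidden size (placeholder; real Llama layers are larger)

# Pretend the model encodes user trust along one fixed direction.
trust_direction = rng.normal(size=d_model)
trust_direction /= np.linalg.norm(trust_direction)

def make_activations(n, trusted):
    """Synthetic hidden states: noise plus a shift along the trust direction."""
    base = rng.normal(size=(n, d_model))
    shift = 2.0 if trusted else -2.0
    return base + shift * trust_direction

X = np.vstack([make_activations(200, True), make_activations(200, False)])
y = np.array([1] * 200 + [0] * 200)  # 1 = "trusted" user, 0 = "untrusted"

# Linear probe: logistic regression on the activations.
probe = LogisticRegression().fit(X, y)

# The probe's (normalized) weight vector approximates the trust vector.
trust_vector = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
alignment = abs(trust_vector @ trust_direction)
print(f"probe accuracy: {probe.score(X, y):.2f}, alignment: {alignment:.2f}")
```

In the real project, `X` would come from residual-stream activations at a chosen layer, and the recovered vector could then be added to activations at inference time to steer the model's perception of the user.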
Developed and applied S-KANformer (Transformers infused with Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions) for high-energy physics symbolic calculations, achieving state-of-the-art performance.
Project link
Investigated belief fragility in a hybrid reasoning model by fine-tuning Qwen3-1.7B on counterfactual data and evaluating behavioral robustness. Applied mechanistic interpretability using a BatchTopKCrossCoder, identifying fine-tune-specific latents linked to the incepted counterfactuals.
Project link
This project develops an open-source chatbot by fine-tuning LLaMA 2 with RAFT and RAG, alongside a retriever fine-tuned on LLM-generated QnA data.
Project link
Detecting personally identifiable information (PII) in student writing using Longformer and DeBERTa. Bronze medal in this featured Kaggle competition.
Project link
Fine-tuned Small Language Models (SLMs) to test their efficacy against LLMs on domain-specific tasks.
Project link
Transformer models for symbolic regression, trained on a subset of the Feynman Dataset. You can find it here.
Project link
Biomass yield forecasting using AutoML and large-scale optimization using Mixed Integer Linear Programming combined with density-based clustering. Solution for the Shell.ai Hackathon 2023.
Project link
Identifying which essay was written by a large language model.
Project link
Feature-packed and fully accessible application for PwDs.
Project link
Certificate link
Certificate link
Certificate link
Certificate link
Certificate link
Certificate link
Certificate link
Certificate link