# Dheeraj Mysore Nagaraj

I am a Research Scientist in the Machine Learning and Optimization (MLO) team at Google AI, Bangalore, India. I work on various topics in theoretical machine learning, applied probability, and statistics. My current work focuses on designing learning algorithms for data with temporal dependence, random graphs, and stochastic optimization algorithms. I recently completed my PhD at the Laboratory for Information and Decision Systems (LIDS) at MIT, advised by Prof. Guy Bresler. Prior to that, I was an undergraduate student at IIT Madras.

### Publications

Near Optimal Heteroscedastic Regression with Symbiotic Learning

Aniket Das

Dheeraj Baby

Conference on Learning Theory (COLT), 2023

We consider regression where the noise distribution depends on the covariates (i.e., heteroscedastic noise), which captures popular settings such as linear regression with multiplicative noise occurring due to covariate uncertainty. In particular, we consider linear regression where the noise variance is an unknown rank-1 quadratic function of the covariates. While an application of least squares regression achieves an error rate of $d/n$, this ignores the fact that the magnitude of the noise can be very small for certain values of the covariates, which can aid faster learning. Our algorithm alternately runs a parameter estimation algorithm and a noise distribution model learning algorithm, each using the other's output to iteratively obtain better estimates of the parameter and the noise distribution model, respectively. This achieves an error rate of $1/n + d^2/n^2$, which we show is minimax optimal up to logarithmic factors. A sub-routine of our algorithm, which performs phase estimation with multiplicative noise, may be of independent interest.
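The alternating scheme described in the abstract can be illustrated on synthetic data. The sketch below is a toy reconstruction, not the paper's exact algorithm: the data model, the Gaussian-covariate moment method used to refit the noise scale, the variance floor, and all constants are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic heteroscedastic data: y = <w*, x> + |<u*, x>| * eps, so the
# noise variance (u*^T x)^2 is a rank-1 quadratic function of x.
d, n = 5, 20000
w_star = rng.normal(size=d)
u_star = 0.5 * rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_star + np.abs(X @ u_star) * rng.normal(size=n)

# Warm start with ordinary least squares (error rate ~ d/n).
w_hat = np.linalg.lstsq(X, y, rcond=None)[0]

for _ in range(5):
    # Step 1: refit the noise model from squared residuals. For Gaussian
    # covariates, E[r^2 x x^T] = ||u||^2 I + 2 u u^T, so the top
    # eigenvector of the centered moment matrix recovers u up to sign.
    r = y - X @ w_hat
    M = (X * (r ** 2)[:, None]).T @ X / n - np.mean(r ** 2) * np.eye(d)
    vals, vecs = np.linalg.eigh(M)
    u_hat = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0) / 2.0)

    # Step 2: weighted least squares, downweighting high-noise points
    # and exploiting the near-noiseless region where <u, x> is small.
    s2 = (X @ u_hat) ** 2 + 1e-3  # small floor keeps weights finite
    Xw = X / s2[:, None]
    w_hat = np.linalg.solve(X.T @ Xw, Xw.T @ y)
```

Each estimate feeds the other: a better noise model sharpens the weights for the parameter fit, and a better parameter fit yields cleaner residuals for the noise model, which is the "symbiotic" structure the abstract describes.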

Indexability is not Enough for Whittle: Improved Near-Optimal Algorithms for Restless Bandits

Abheek Ghosh

Manish Jain

Autonomous Agents and Multi-Agent Systems (AAMAS), 2023

We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare.
Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions. In this work, we first show that Whittle index policies can fail in simple and practically relevant RMAB settings, even when the RMABs are indexable. We further discuss why Whittle index policies can provably fail in these settings despite indexability, and why even asymptotic optimality does not translate well to practically relevant planning horizons.
We then propose an alternate planning algorithm based on the mean-field method, which borrows ideas from existing research with some improvements. This algorithm can provably and efficiently obtain near-optimal policies when the number of arms, $N$, is large, without the stringent structural assumptions required by Whittle index policies. Our approach is hyper-parameter free, and we provide an improved non-asymptotic analysis with (a) a better dependence on problem-dependent parameters, (b) high-probability upper bounds showing that the reward of the policy is reliable, and (c) matching lower bounds for this algorithm, demonstrating the tightness of our bounds. Our extensive experimental analysis shows that the mean-field approach matches or outperforms other baselines.
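The mean-field viewpoint underlying such planning algorithms can be illustrated on a toy instance via the standard occupancy-measure (fluid) LP relaxation, whose value upper-bounds the long-run per-arm reward. The two-state instance, its transition and reward numbers, and the budget `alpha` below are all assumptions made for this example; this is the relaxation such methods build on, not the paper's full algorithm.

```python
import numpy as np
from scipy.optimize import linprog

# Toy homogeneous RMAB: 2 states, 2 actions (0 = passive, 1 = pull),
# and a budget alpha = fraction of arms that may be pulled per round.
S, A, alpha = 2, 2, 0.4
# P[a][s, s'] transition matrices; r[s, a] rewards.
P = np.array([[[0.9, 0.1],   # passive: arms drift back toward state 0
               [0.6, 0.4]],
              [[0.2, 0.8],   # pull: pushes arms toward state 1
               [0.1, 0.9]]])
r = np.array([[0.0, 0.0],    # reward only for pulling an arm in state 1
              [0.0, 1.0]])

# Fluid LP over the stationary occupancy measure mu(s, a):
#   max  sum_{s,a} mu(s,a) r(s,a)
#   s.t. sum_a mu(s',a) = sum_{s,a} mu(s,a) P(s'|s,a)   (stationarity)
#        sum_{s,a} mu(s,a) = 1,  sum_s mu(s,1) <= alpha (budget)
nvar = S * A                  # mu flattened as mu[s * A + a]
c = -r.flatten()              # linprog minimizes, so negate rewards
A_eq = np.zeros((S + 1, nvar))
b_eq = np.zeros(S + 1)
for sp in range(S):           # stationarity rows
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] += P[a, s, sp]
    A_eq[sp, sp * A: sp * A + A] -= 1.0
A_eq[S, :] = 1.0              # normalization row
b_eq[S] = 1.0
A_ub = np.zeros((1, nvar))
A_ub[0, 1::A] = 1.0           # budget on action 1 across all states
res = linprog(c, A_ub=A_ub, b_ub=[alpha], A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * nvar)
mu = res.x.reshape(S, A)
fluid_value = -res.fun        # per-arm upper bound on long-run reward
```

A mean-field policy then pulls arms so that the empirical state-action fractions track the optimal `mu`; as the number of arms $N$ grows, the per-arm reward of such policies approaches `fluid_value`.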
