Milind Tambe

Milind Tambe is Principal Scientist and Director of "AI for Social Good" at Google Research; concurrently, he is Gordon McKay Professor of Computer Science and Director of the Center for Research on Computation and Society at Harvard University. Dr. Tambe's research focuses on advancing AI and multiagent systems research for social impact. He is a recipient of the IJCAI (International Joint Conference on AI) John McCarthy Award, the AAAI (Association for the Advancement of Artificial Intelligence) Feigenbaum Prize, the ACM/SIGAI Autonomous Agents Research Award from AAMAS (Autonomous Agents and Multiagent Systems Conference), the AAAI Robert S. Engelmore Memorial Lecture Award, the INFORMS Wagner Prize, and the MORS (Military Operations Research Society) Rist Prize. He is a fellow of AAAI and ACM. For his work in AI and public safety, he has received the Columbus Foundation Homeland Security Award, a Meritorious Team Commendation from the US Coast Guard and LA Airport Police, and a Certificate of Appreciation from the US Federal Air Marshals Service for pioneering real-world deployments of security games. Prof. Tambe's papers have received best paper awards or best paper finalist recognition 30 times at conferences such as AAAI, AAMAS, IJCAI and others. Prof. Tambe received his Ph.D. from the School of Computer Science at Carnegie Mellon University.

Authored Publications
    This paper studies restless multi-armed bandit (RMAB) problems with unknown arm transition dynamics but with known correlated arm features. The goal is to learn a model to predict transition dynamics given features, where the Whittle index policy solves the RMAB problems using predicted transitions. However, prior works often learn the model by maximizing the predictive accuracy instead of final RMAB solution quality, causing a mismatch between training and evaluation objectives. To address this shortcoming, we propose a novel approach for decision-focused learning in RMAB that directly trains the predictive model to maximize the Whittle index solution quality. We present three key contributions: (i) we establish differentiability of the Whittle index policy to support decision-focused learning; (ii) we significantly improve the scalability of decision-focused learning approaches in sequential problems, specifically RMAB problems; (iii) we apply our algorithm to a previously collected dataset of maternal and child health to demonstrate its performance. Indeed, our algorithm is the first for decision-focused learning in RMAB that scales to real-world problem sizes.
    We consider the task of effect estimation of resource allocation algorithms through clinical trials. Such algorithms are tasked with optimally utilizing severely limited intervention resources, with the goal of maximizing the overall benefits derived. Evaluation of such algorithms through clinical trials proves difficult, notwithstanding the scale of the trial, because the agents’ outcomes are inextricably linked through the budget constraint controlling the intervention decisions. Towards building more powerful estimators with improved statistical significance estimates, we propose a novel concept involving retrospective reshuffling of participants across experimental arms at the end of a clinical trial. We identify conditions under which such reassignments are permissible and can be leveraged to construct counterfactual clinical trials, whose outcomes can be accurately ‘observed’ without uncertainty, for free. We prove theoretically that such an estimator is more accurate than common estimators based on sample means: we show that it returns an unbiased estimate and simultaneously reduces variance. We demonstrate the value of our approach through empirical experiments on real case studies as well as synthetic and realistic data sets, and show improved estimation accuracy across the board.
    Analyzing and Predicting Low-Listenership Trends in a Large-Scale Mobile Health Program: A Preliminary Investigation
    Shresth Verma
    Kumar Madhu Sudan
    Amrita Mahale
    Aparna Hegde
    The Workshop in Data Science for Social Good, KDD 2023 (2023)
    Mobile health programs are becoming an increasingly popular medium for dissemination of health information among beneficiaries in less privileged communities. Kilkari is one of the world’s largest mobile health programs, which delivers time-sensitive audio messages to pregnant women and new mothers. We have been collaborating with ARMMAN, a non-profit in India which operates the Kilkari program, to identify bottlenecks to improve the efficiency of the program. In particular, we provide an initial analysis of the trajectories of beneficiaries’ interaction with the mHealth program and examine elements of the program that can be potentially enhanced to boost its success. We cluster the cohort into different buckets based on listenership so as to analyze listenership patterns for each group that could help boost program success. We also demonstrate preliminary results on using historical data in a time-series prediction to identify beneficiary dropouts and enable NGOs to devise timely interventions to strengthen beneficiary retention.
    Robust Planning over Restless Groups: Engagement Interventions for a Large-Scale Maternal Telehealth Program
    Jackson Killian
    Lily Xu
    Arpita Biswas
    Shresth Verma
    Vineet Nair
    Aparna Hegde
    Neha Madhiwalla
    Paula Rodriguez Diaz
    Sonja Johnson-Yu
    AAAI 2023 (to appear)
    In 2020, maternal mortality in India was estimated to be as high as 130 deaths per 100K live births, nearly twice the UN’s target. To improve health outcomes, the non-profit ARMMAN sends automated voice messages to expecting and new mothers across India. However, 38% of mothers stop listening to these calls, missing critical preventative care information. To improve engagement, ARMMAN employs health workers to intervene by making service calls, but workers can only call a fraction of the 100K enrolled mothers. Partnering with ARMMAN, we model the problem of allocating limited interventions across mothers as a restless multi-armed bandit (RMAB), where the realities of large scale and model uncertainty present key new technical challenges. We address these with GROUPS, a double oracle–based algorithm for robust planning in RMABs with scalable grouped arms. Robustness over grouped arms requires several methodological advances. First, to adversarially select stochastic group dynamics, we develop a new method to optimize Whittle indices over transition probability intervals. Second, to learn group level RMAB policy best responses to these adversarial environments, we introduce a weighted index heuristic. Third, we prove a key theoretical result that planning over grouped arms achieves the same minimax regret–optimal strategy as planning over individual arms, under a technical condition. Finally, using real world data from ARMMAN, we show that GROUPS produces robust policies that reduce minimax regret by up to 50%, halving the number of preventable missed voice messages to connect more mothers with life saving maternal health information.
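To make the minimax-regret criterion named in the abstract above concrete, here is a minimal Python sketch. It is not the GROUPS algorithm: it only evaluates a small, fixed set of candidate policies against a fixed set of environments and considers pure strategies, whereas a double oracle method grows both sets iteratively. The function name `minimax_regret_policy` and the reward table are hypothetical and chosen purely for illustration.

```python
def minimax_regret_policy(rewards):
    """Pick the policy whose worst-case regret is smallest.

    rewards[p][e] is the reward of candidate policy p in environment e.
    All numbers used with this function below are made up for illustration.
    """
    n_policies = len(rewards)
    n_envs = len(rewards[0])
    # Best reward achievable in each environment among the candidates.
    best = [max(rewards[p][e] for p in range(n_policies)) for e in range(n_envs)]
    # Regret of a policy in an environment is the gap to that best reward;
    # take the worst such gap across environments for each policy.
    max_regret = [
        max(best[e] - rewards[p][e] for e in range(n_envs))
        for p in range(n_policies)
    ]
    choice = min(range(n_policies), key=lambda p: max_regret[p])
    return choice, max_regret


# Hypothetical reward table: 3 candidate policies (rows), 3 environments (columns).
rewards = [
    [10.0, 2.0, 6.0],
    [7.0, 6.0, 5.0],
    [4.0, 8.0, 3.0],
]
choice, max_regret = minimax_regret_policy(rewards)
print(choice, max_regret)
```

In this toy table the second policy is never the best in any single environment, yet it is selected because its largest gap to the best policy is the smallest, which is the sense in which a minimax-regret policy is robust to uncertainty about the environment.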
    Adherence Bandits
    Jackson A. Killian*
    Aditya Mate*
    Manish Jain
    The Workshop on Artificial Intelligence for Social Good at AAAI 2023 (2023)
    We define a new subclass of the restless multi-armed bandit framework, which we name Adherence Bandits, designed to capture the dynamics prevalent in many public health intervention problems. We discuss key properties of Adherence Bandits, their real-world motivations, how their structures lead to both technical and computational advantages, and natural extensions that have been or can be made to the subclass. We summarise key research works that have contributed to the growing sub-area and finish by highlighting future directions of research.
    Restless multi-armed bandits (RMABs) are an extension of multi-armed bandits (MABs) with state information associated with arms, where the states evolve restlessly with different transition probabilities depending on whether the arms are pulled. The additional state information in RMABs captures broader applications with state dependency, including digital marketing and healthcare recommendation. However, solving RMABs requires information on transition dynamics, which is often not available upfront. This paper considers learning the transition probabilities in an RMAB setting while maintaining small regret. We use the confidence bounds of transition probabilities to define an optimistic Whittle index policy to solve the RMAB problem while maintaining sub-linear regret compared to the benchmark. Our algorithm, UCWhittle, leverages the structure of RMABs and the Whittle index policy solution to achieve better performance than other online learning baselines without structural information. We evaluate UCWhittle on real-world healthcare data to help reduce maternal mortality.
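The idea of placing confidence bounds on transition probabilities, mentioned in the abstract above, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the UCWhittle algorithm: it uses a standard Hoeffding-style radius, which may differ from the bound used in the paper, and the function name `transition_interval` and the counts are hypothetical. How an optimistic Whittle index is then computed from such intervals is not shown.

```python
import math


def transition_interval(successes, visits, delta=0.05):
    """Hoeffding-style confidence interval for one transition probability.

    successes: times the transition of interest was observed (hypothetical counts)
    visits:    times the (state, action) pair was visited
    delta:     allowed failure probability of the interval
    """
    if visits == 0:
        # Nothing observed yet, so every probability remains plausible.
        return 0.0, 1.0
    p_hat = successes / visits
    # Assumed radius: sqrt(log(2/delta) / (2 * visits)), a textbook Hoeffding bound.
    radius = math.sqrt(math.log(2.0 / delta) / (2.0 * visits))
    return max(0.0, p_hat - radius), min(1.0, p_hat + radius)


# The interval shrinks as the same (state, action) pair is observed more often.
few = transition_interval(30, 100)
many = transition_interval(300, 1000)
print(few, many)
```

An optimistic planner would choose transition probabilities from inside these intervals that make the arms look as valuable as possible, so that rarely observed arms are not dismissed too early.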
    We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions. In this work, we first show that Whittle index policies can fail in simple and practically relevant RMAB settings, even when the RMABs are indexable. We further discuss why the Whittle index policies can provably fail in these settings, despite indexability and how even asymptotic optimality does not translate well to practically relevant planning horizons. We then propose an alternate planning algorithm based on the mean-field method, which borrows ideas from existing research with some improvements. This algorithm can provably and efficiently obtain near-optimal policies when the number of arms, $N$, is large without the stringent structural assumptions required by Whittle index policies. Our approach is hyper-parameter free, and we provide an improved non-asymptotic analysis which has a) a better dependence on problem dependent parameters b) high probability upper bounds which show that the reward of the policy is reliable c) matching lower bounds for this algorithm, thus demonstrating the tightness of our bounds. Our extensive experimental analysis shows that the mean-field approach matches or outperforms other baselines.
    Deployed SAHELI: Field Optimization of Intelligent RMAB for Maternal and Child Care
    Shresth Verma
    Aditya S. Mate
    Paritosh Verma
    Sruthi Gorantala
    Neha Madhiwalla
    Aparna Hegde
    Manish Jain
    Innovative Applications of Artificial Intelligence (IAAI) (2023) (to appear)
    Underserved communities face critical health challenges due to lack of access to timely and reliable information. Non-governmental organizations are leveraging the widespread use of cellphones to combat these healthcare challenges and spread preventative awareness. The health workers at these organizations reach out individually to beneficiaries; however, such programs still suffer from declining engagement. We have deployed SAHELI, a system to efficiently utilize the limited availability of health workers for improving maternal and child health in India. SAHELI uses the Restless Multi-armed Bandit (RMAB) framework to identify beneficiaries for outreach. It is the first deployed application for RMABs in public health, and is already in continuous use by our partner NGO, ARMMAN. We have already reached ∼100K beneficiaries with SAHELI, and are on track to serve 1 million beneficiaries by the end of 2023. This scale and impact has been achieved through multiple innovations in the RMAB model and its development, in preparation of real world data, and in deployment practices; and through careful consideration of responsible AI practices. Specifically, in this paper, we describe our approach to learn from past data to improve the performance of SAHELI’s RMAB model, the real-world challenges faced during deployment and adoption of SAHELI, and the end-to-end pipeline.
    Restless Multi-Armed Bandits (RMABs) are an important model that enables optimizing allocation of limited resources in sequential decision-making settings. Typical RMABs assume the budget (the number of arms pulled) per round to be fixed for each step in the planning horizon. However, when planning in real-world settings, resources are not necessarily limited at each planning step; we may be able to distribute surplus resources in one round to an earlier or later round. Often this flexibility in budget is constrained to within a subset of consecutive planning steps. In this paper we define a general class of RMABs with flexible budget, which we term F-RMABs, and provide an algorithm to optimally solve for them. Additionally, we provide heuristics that trade off solution quality for efficiency and present experimental comparisons of different F-RMAB solution approaches.
    Facilitating Human-Wildlife Cohabitation through Conflict Prediction
    Susobhan Ghosh
    Pradeep Varakantham
    Aniket Bhatkhande
    Tamanna Ahmad
    Anish Andheria
    Wenjun Li
    IAAI Technical Track on Emerging Applications of AI (2022)
    With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large scale loss of lives (animal and human) and livelihoods (economic). While community knowledge is valuable, forest officials and conservation organisations can greatly benefit from predictive analysis of human-wildlife conflict, leading to targeted interventions that can potentially help save lives and livelihoods. However, the problem of prediction is a complex socio-technical problem in the context of limited data in low-resource regions. Identifying the right features to make accurate predictions of conflicts at the required spatial granularity using a sparse conflict training dataset is the key challenge that we address in this paper. Specifically, we do an illustrative case study on human-wildlife conflicts in the Bramhapuri Forest Division in Chandrapur, Maharashtra, India. Most existing work has considered human-wildlife conflicts in protected areas and, to the best of our knowledge, this is the first effort at prediction of human-wildlife conflicts in unprotected areas and using those predictions for deploying interventions on the ground.
    ADVISER: AI-Driven Vaccination Intervention Optimiser for Increasing Vaccine Uptake in Nigeria
    Vineet Nair
    Kritika Prakash
    Michael Wilbur
    Corinne Namblard
    Oyindamola Adeyemo
    Abhishek Dubey
    Abiodun Adereni
    Ayan Mukhopadhyay
    IJCAI '22 Social Good Track (2022)
    More than 5 million children under five years die from largely preventable or treatable medical conditions every year, with an overwhelmingly large proportion of deaths occurring in under-developed countries with low vaccination uptake. One of the United Nations’ sustainable development goals (SDG 3) aims to end preventable deaths of newborns and children under five years of age. We focus on Nigeria, where the rate of infant mortality is appalling. We collaborate with HelpMum, a large non-profit organization in Nigeria, to design and optimize the allocation of heterogeneous health interventions under uncertainty to increase vaccination uptake, the first such collaboration in Nigeria. Our framework, ADVISER: AI-Driven Vaccination Intervention Optimiser, is based on an integer linear program that seeks to maximize the cumulative probability of successful vaccination. Our optimization formulation is intractable in practice. We present a heuristic approach that enables us to solve the problem for real-world use-cases. We also present theoretical bounds for the heuristic method. Finally, we show that the proposed approach outperforms baseline methods in terms of vaccination uptake through experimental evaluation. HelpMum is currently planning a pilot program based on our approach to be deployed in the largest city of Nigeria, which would be the first deployment of an AI driven vaccination uptake program in the country and hopefully, pave the way for other data-driven programs to improve health outcomes in Nigeria.
    Case Study: Applying Decision Focused Learning in the Real World
    Aditya Mate
    Kai Wang
    Shresth Verma
    NeurIPS Workshop on Trustworthy and Socially Responsible Machine Learning (2022) (to appear)
    Many real world optimization problems with underlying unknown model parameters are solved using the predict-then-optimize framework. In particular, a model is first learnt to predict the parameters of the optimization problem, which is subsequently solved using an optimization algorithm. However, this approach optimizes for predictive accuracy rather than the quality of the final solution. Decision Focused Learning (DFL) solves this objective mismatch by integrating the optimization problem in the learning pipeline. Previous works have only shown the applicability of DFL in simulation settings. In our work, we consider the optimization problem of scheduling limited live service calls in Maternal and Child Health Awareness Programs and model it using Restless Multi-Armed Bandits (RMAB). In collaboration with an NGO, we conduct a large-scale field study consisting of 9000 beneficiaries for 6 weeks and track key engagement metrics in a mobile health awareness program. To the best of our knowledge this is the first real world study involving Decision Focused Learning. We demonstrate that beneficiaries in the DFL group experience statistically significant reductions in cumulative engagement drop, while those in the Predict-then-Optimize group do not. This establishes the practicality of decision focused learning for real world problems. We also demonstrate that DFL learns a better decision boundary between the RMAB actions, and strategically predicts parameters which contribute most to the final decision outcome.
    Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health
    Aditya Mate
    Lovish Madaan
    Neha Madhiwalla
    Shresth Verma
    Aparna Hegde
    Pradeep Varakantham
    AAAI Conference on Artificial Intelligence (2022) (to appear)
    The widespread availability of cell phones has enabled non-profits to deliver critical health information to their beneficiaries in a timely manner. This paper describes our work to assist non-profits that employ automated messaging programs to deliver timely preventive care information to beneficiaries (new and expecting mothers) during pregnancy and after delivery. Unfortunately, a key challenge in such information delivery programs is that a significant fraction of beneficiaries drop out of the program. Yet, non-profits often have limited health-worker resources (time) to place crucial service calls for live interaction with beneficiaries to prevent such engagement drops. To assist non-profits in optimizing this limited resource, we developed a Restless Multi-Armed Bandits (RMABs) system. One key technical contribution in this system is a novel clustering method of offline historical data to infer unknown RMAB parameters. Our second major contribution is evaluation of our RMAB system in collaboration with an NGO, via a real-world service quality improvement study. The study compared strategies for optimizing service calls to 23003 participants over a period of 7 weeks to reduce engagement drops. We show that the RMAB group provides statistically significant improvement over other comparison groups, reducing engagement drops by 30%. To the best of our knowledge, this is the first study demonstrating the utility of RMABs in real world public health settings. We are transitioning our RMAB system to the NGO for real-world use.
    Measuring Data Collection Diligence for Community Healthcare
    Ramesha Karunasena
    Md Sarfrazul Ambiya
    Arunesh Sinha
    Ruchit Nagar
    Dhyanesh Narayanan
    ACM conference on Equity and Access in Algorithms, Mechanisms, and Optimization 2021 (2021)
    Data analytics has tremendous potential to provide targeted benefit in low-resource communities. However, the availability of high-quality public health data is a significant challenge in developing countries, primarily due to non-diligent data collection by community health workers (CHWs). Our use of the word non-diligence here is to emphasize that poor data collection is often not a deliberate action by CHWs but arises due to a myriad of factors, sometimes beyond the control of the CHW. In this work, we define and test a data collection diligence score. This challenging unlabeled data problem is handled by building upon domain experts’ guidance to design a useful data representation of the raw data, using which we design a simple and natural score. An important aspect of the score is relative scoring of the CHWs, which implicitly takes into account the context of the local area. The data is also clustered, and interpreting these clusters provides a natural explanation of the past behavior of each data collector. We further predict the diligence score for future time steps. Our framework has been validated on the ground using observations by the field monitors of our partner NGO in India. Beyond the successful field test, our work is in the final stages of deployment in the state of Rajasthan, India. This system will be helpful in providing non-punitive intervention and necessary guidance to encourage CHWs.
    Cohorting to isolate asymptomatic spreaders: An agent-based simulation study on the Mumbai Suburban Railway
    Sharad Shriram
    Nidhin Vaidhiyan
    Gaurav Aggarwal
    Jiangzhuo Chen
    Srini Venkatramanan
    Lijing Wang
    Aniruddha Adiga
    Adam Sadilek
    Madhav Marathe
    Rajesh Sundaresan
    AAMAS 2021 (2021), pp. 1680
    The Mumbai Suburban Railways (locals) are a key transit infrastructure of the city and are crucial for resuming normal economic activity. To reduce disease transmission, policymakers can enforce reduced crowding and mandate wearing of masks. Cohorting, that is, forming groups of travelers that always travel together, is an additional policy to reduce disease transmission on locals without severe restrictions. Cohorting allows us to: (i) form traveler bubbles, thereby decreasing the number of distinct interactions over time; (ii) potentially quarantine an entire cohort if a single case is detected, making contact tracing more efficient, and (iii) target cohorts for testing and early detection of symptomatic as well as asymptomatic cases. Studying the impact of cohorts using compartmental models is challenging because of the ensuing representational complexity. Agent-based models provide a natural way to represent cohorts along with the representation of the cohort members with the larger social network. This paper describes a novel multi-scale agent-based model to study the impact of cohorting strategies on COVID-19 dynamics in Mumbai. We achieve this by modeling the Mumbai urban region using a detailed agent-based model comprising 12.4 million agents. Individual cohorts and their inter-cohort interactions as they travel on locals are modeled using local mean field approximations. The resulting multi-scale model in conjunction with a detailed disease transmission and intervention simulator is used to assess various cohorting strategies. The results provide a quantitative trade-off between cohort size and its impact on disease dynamics and well being. The results show that cohorts can provide significant benefit in terms of reduced transmission without significantly impacting ridership and/or economic & social activity.