Rajarishi Sinha
Raj Sinha is a Research Engineer at Google Cloud AI Research. Before joining Cloud AI Research, he worked on a Cloud AI applied ML team. Prior to joining Google, he was a Research Scientist and Co-Founder at a MEMS (Microelectromechanical Systems) and analog IC startup. He received his PhD in Engineering from Carnegie Mellon University. He has broad interests, including natural language processing, robotics and machine learning theory.
Authored Publications
SQLPrompt: Improved In-context Learning for Few-shot Text-to-SQL
Findings of Conference on Empirical Methods in Natural Language Processing (EMNLP) (2023)
Text-to-SQL aims to automate the process of generating SQL queries on a database from natural language text. In this work, we propose "SQLPrompt", tailored to improve the few-shot prompting capabilities of Text-to-SQL for Large Language Models (LLMs). Our methods include innovative prompt design, an execution-based consistency decoding strategy that selects the SQL with the most consistent execution outcome among the SQL proposals, and a method that aims to improve performance by diversifying the SQL proposals during consistency selection with different prompt designs ("MixPrompt") and foundation models ("MixLLMs"). We show that SQLPrompt outperforms previous approaches for in-context learning with few labeled data by a large margin, closing the gap with the fine-tuned state of the art trained on thousands of labeled data.
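The execution-based consistency selection described in this abstract can be illustrated with a minimal sketch. This is not the SQLPrompt implementation: the helper names (`execute`, `pick_most_consistent`), the toy `singer` table, and the hard-coded candidate queries (standing in for SQL proposals sampled from an LLM) are all assumptions made for illustration. The idea shown is simply to run every candidate, discard those that fail, and keep a candidate whose result set is the most common one.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run one candidate query; return a hashable, order-insensitive result,
    or None if the query fails to execute."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def pick_most_consistent(conn, candidates):
    """Return the first candidate whose execution result is the most common
    one among all candidates that executed successfully."""
    results = [execute(conn, sql) for sql in candidates]
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None  # every candidate failed to execute
    winner, _ = votes.most_common(1)[0]
    return candidates[results.index(winner)]


# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", 31), ("Bo", 45), ("Cy", 28)],
)

# Stand-ins for SQL proposals; a real system would sample these from one or
# more prompts and models.
candidates = [
    "SELECT name FROM singer WHERE age > 30",
    "SELECT name FROM singer WHERE age >= 31",  # same result as the first
    "SELECT name FROM singer WHERE age > 40",   # different result
    "SELECT nme FROM singer WHERE age > 30",    # invalid column, fails
]
best = pick_most_consistent(conn, candidates)
print(best)
```

Voting on execution results rather than on query text lets syntactically different but equivalent queries reinforce each other, which is why the first two candidates above form the majority.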
A prospective evaluation of AI-augmented epidemiology to forecast COVID-19 in the USA and Japan
Joel Shor
Arkady Epshteyn
Ashwin Sura Ravi
Beth Luan
Chun-Liang Li
Daisuke Yoneoka
Dario Sava
Hiroaki Miyata
Hiroki Kayama
Isaac Jones
Joe Mckenna
Johan Euphrosine
Kris Popendorf
Nate Yoder
Shashank Singh
Shuhei Nomura
Thomas Tsai
npj Digital Medicine (2021)
The COVID-19 pandemic has highlighted the global need for reliable models of disease spread. We evaluate an AI-improved forecasting approach that provides daily predictions of the expected number of confirmed COVID-19 deaths, cases and hospitalizations during the following 28 days. We present an international, prospective evaluation of model performance across all states and counties in the USA and prefectures in Japan. National mean absolute percentage error (MAPE) for predicting COVID-19 associated deaths before and after prospective deployment remained consistently <3% (US) and <10% (Japan). Average statewide (US) and prefecture-wide (Japan) MAPE was 6% and 20% respectively (14% when looking at prefectures with more than 10 deaths). We show our model performs well even during periods of considerable change in population behavior, and that it is robust to demographic differences across different geographic locations. We further demonstrate the model provides meaningful explanatory insights, finding that the model appropriately responds to local and national policy interventions. Our model enables counterfactual simulations, which indicate that continuing non-pharmaceutical interventions (NPIs) alongside vaccinations is essential for recovering more rapidly from the pandemic and that delaying the application of interventions has a detrimental effect, and which allow exploration of the consequences of different vaccination strategies. The COVID-19 pandemic remains a global emergency. In the face of substantial challenges ahead, the approach presented here has the potential to inform critical decisions.
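For readers unfamiliar with the metric quoted in this abstract, the sketch below computes a mean absolute percentage error in its common textbook form. The exact definition and aggregation used in the paper may differ. The function name `mape`, the `min_actual` parameter, and the numbers are hypothetical and chosen only to show why restricting to regions with larger counts (as the abstract does for prefectures with more than 10 deaths) can lower the reported error.

```python
def mape(actual, predicted, min_actual=0):
    """Mean absolute percentage error, in percent.

    Only pairs whose actual value is strictly greater than `min_actual`
    are scored, which also avoids division by zero.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a > min_actual]
    if not pairs:
        raise ValueError("no values above min_actual to score")
    return 100.0 * sum(abs(a - p) / a for a, p in pairs) / len(pairs)


# Made-up counts: one small region and one large region.
actual = [5, 100]
predicted = [10, 110]

overall = mape(actual, predicted)                 # small region dominates
filtered = mape(actual, predicted, min_actual=10)  # large region only
print(overall, filtered)
```

In this toy case a miss of 5 on a count of 5 is a 100% error, while a miss of 10 on a count of 100 is a 10% error, so small counts can dominate an unfiltered average.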
Interpretable Sequence Learning for Covid-19 Forecasting
Chun-Liang Li
Arkady Epshteyn
Shashank Singh
Martin Nikoltchev
Yash Kumar Sonthalia
NeurIPS (2020)
We propose a novel model that integrates machine learning into compartmental disease modeling to predict the progression of Covid-19. Our model incorporates explainable encoding of information-bearing covariates to improve performance. The motivation to maintain explainability is twofold: the behavior of the resulting model will be credible with epidemiologists, and will instill confidence in the intended end users, namely policy makers and healthcare institutions. The proposed model can be applied at different geographic resolutions, and we demonstrate it for United States' states and counties. We show that the forecasting accuracy of our model is significantly better than the alternatives, and the explanatory insights from it are qualitatively meaningful.
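As a rough illustration of what a compartmental model with covariate-driven rates can look like, the sketch below steps a basic discrete-time SIR model whose infection rate is a logistic function of daily covariates. It is not the model from the paper: the compartments, the logistic link, the single "mobility restriction" covariate, and all names and parameter values are assumptions for illustration. In a learned model the weights would be fitted to data; here they are fixed by hand.

```python
import math


def infection_rate(covariates, weights, bias):
    """Map covariates to a rate in (0, 1) with a logistic link.
    The linear weights remain directly inspectable."""
    z = bias + sum(w * x for w, x in zip(weights, covariates))
    return 1.0 / (1.0 + math.exp(-z))


def sir_step(s, i, r, beta, gamma):
    """Advance susceptible/infected/recovered counts by one day."""
    n = s + i + r
    new_infections = beta * s * i / n
    new_recoveries = gamma * i
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries


def simulate(s, i, r, covariates_by_day, weights, bias, gamma):
    """Run the model forward, recomputing the infection rate each day."""
    history = [(s, i, r)]
    for covariates in covariates_by_day:
        beta = infection_rate(covariates, weights, bias)
        s, i, r = sir_step(s, i, r, beta, gamma)
        history.append((s, i, r))
    return history


# Hypothetical covariate: 0.0 = no restriction, 1.0 = restriction in place.
covariates_by_day = [[0.0]] * 5 + [[1.0]] * 5
history = simulate(
    990.0, 10.0, 0.0, covariates_by_day,
    weights=[-2.0],  # a negative weight means the restriction lowers the rate
    bias=-1.0,
    gamma=0.1,
)
print(history[-1])
```

Because the rate depends on the covariate through a single signed weight, the direction of a covariate's effect can be read off directly, which is one simple sense in which such an encoding stays explainable.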