Publications

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field.

1 - 15 of 10310 publications
    Preview abstract This paper presents SYMBIOSIS, an AI-powered framework to make Systems Thinking accessible for addressing societal challenges and to unlock paths for leveraging systems thinking frameworks to improve AI systems. The platform establishes a centralized, open-source repository of systems thinking/system dynamics models categorized by Sustainable Development Goals (SDGs) and societal topics using topic modeling and classification techniques. Systems Thinking resources, though critical for articulating causal theories in complex problem spaces, are often locked behind specialized tools and intricate notations, creating high barriers to entry. To address this, we developed a generative co-pilot that translates complex systems representations, such as causal loops and stock-flow diagrams, into natural language (and vice versa), allowing users to explore and build models without extensive technical training. Rooted in community-based system dynamics (CBSD) and informed by community-driven insights on societal context, we aim to bridge the problem understanding chasm. This gap, driven by epistemic uncertainty, often limits ML developers who lack the community-specific knowledge essential for problem understanding and formulation, often leading to misaligned causal theories and reduced intervention effectiveness. Recent research identifies causal and abductive reasoning as crucial frontiers for AI, and Systems Thinking provides a naturally compatible framework for both. By making Systems Thinking frameworks more accessible and user-friendly, we aim to serve as a foundational step to unlock future research into Responsible and society-centered AI that better integrates societal context by leveraging systems thinking frameworks and models. Our work underscores the need for ongoing research into AI's capacity to understand essential system dynamics such as feedback processes and time delays, paving the way for more socially attuned, effective AI systems. View details
    Gemini & Physical World: Large Language Models Can Estimate the Intensity of Earthquake Shaking from Multi-Modal Social Media Posts
    Marc Stogaitis
    Tajinder Gadh
    Richard Allen
    Alexei Barski
    Robert Bosch
    Patrick Robertson
    Youngmin Cho
    Nivetha Thiruverahan
    Aman Raj
    Geophysical Journal International (2025), ggae436
    Preview abstract This paper presents a novel approach for estimating ground shaking intensity using real-time social media data and CCTV footage. Employing Gemini 1.5 Pro (Reid et al. 2024), a multi-modal language model, we demonstrate the ability to extract relevant information from unstructured data utilizing generative AI and natural language processing. The model’s output, in the form of Modified Mercalli Intensity (MMI) values, aligns well with independent observational data. Furthermore, our results suggest that beyond its advanced visual and auditory understanding abilities, Gemini appears to utilize additional sources of knowledge, including a simplified understanding of the general relationship between earthquake magnitude, distance, and MMI intensity, which it presumably acquired during its training, in its reasoning and decision-making processes. These findings raise intriguing questions about the extent of Gemini's general understanding of the physical world and its phenomena. Gemini’s ability to generate results consistent with established scientific knowledge highlights the potential of LLMs like Gemini in augmenting our understanding of complex physical phenomena such as earthquakes. More specifically, the results of this study highlight the potential of LLMs like Gemini to revolutionize citizen seismology by enabling rapid, effective, and flexible analysis of crowdsourced data from eyewitness accounts for assessing earthquake impact and providing crisis situational awareness. This approach holds great promise for improving early warning systems, disaster response, and overall resilience in earthquake-prone regions. This study provides a significant step toward harnessing the power of social media and AI for earthquake disaster mitigation. View details
    Linear Elastic Caching via Ski Rental
    Todd Lipcon
    The biennial Conference on Innovative Data Systems Research (2025)
    Preview abstract In this work we study the Linear Elastic Caching problem, where the goal is to minimize the total cost of a cache inclusive of not just its misses, but also its memory footprint integrated over time. We demonstrate a theoretical connection to the classic ski rental problem and propose a practical algorithm that combines online caching algorithms with ski rental policies. We also introduce a lightweight machine learning-based algorithm for ski rental that is optimized for production workloads and is easy to integrate within existing database systems. Evaluations on both production workloads in Google Spanner and publicly available traces show that the proposed elastic caching approach can significantly reduce the total cache cost compared to traditional fixed-size cache policies. View details
    Oculomics: Current Concepts and Evidence
    Zhuoting Zhu
    Yueye Wang
    Ziyi Qi
    Wenyi Hu
    Xiayin Zhang
    Siegfried Wagner
    Yujie Wang
    An Ran Ran
    Joshua Ong
    Ethan Waisberg
    Mouayad Masalkhi
    Alex Suh
    Yih Chung Tham
    Carol Y. Cheung
    Xiaohong Yang
    Honghua Yu
    Zongyuan Ge
    Wei Wang
    Bin Sheng
    Andrew G. Lee
    Alastair Denniston
    Peter van Wijngaarden
    Pearse Keane
    Ching-Yu Cheng
    Mingguang He
    Tien Yin Wong
    Progress in Retinal and Eye Research (2025)
    Preview abstract The eye provides novel insights into general health, as well as pathogenesis and development of systemic diseases. In the past decade, growing evidence has demonstrated that the eye's structure and function mirror multiple systemic health conditions, especially in cardiovascular diseases, neurodegenerative disorders, and kidney impairments. This has given rise to the field of oculomics: the application of ophthalmic biomarkers to understand mechanisms and to detect and predict disease. The development of this field has been accelerated by three major advances: 1) the availability and widespread clinical adoption of high-resolution and non-invasive ophthalmic imaging (“hardware”); 2) the availability of large studies to interrogate associations (“big data”); 3) the development of novel analytical methods, including artificial intelligence (AI) (“software”). Oculomics offers an opportunity to enhance our understanding of the interplay between the eye and the body, while supporting development of innovative diagnostic, prognostic, and therapeutic tools. These advances have been further accelerated by developments in AI, coupled with large-scale linkage datasets linking ocular imaging data with systemic health data. Oculomics also enables the detection, screening, diagnosis, and monitoring of many systemic health conditions. Furthermore, oculomics with AI allows prediction of the risk of systemic diseases, enabling risk stratification and opening up new avenues for individualized risk prediction and prevention, facilitating personalized medicine. In this review, we summarise current concepts and evidence in the field of oculomics, highlighting the progress that has been made, remaining challenges, and the opportunities for future research. View details
    Preview abstract Measuring productivity is equivalent to building a model. All models are wrong, but some are useful. Productivity models are often “worryingly selective” (wrong because of omissions). Worrying selectivity can be combated by taking a holistic approach that includes multiple measurements of multiple outcomes. Productivity models should include multiple outcomes, metrics, and methods. View details
    Preview abstract Recent work suggested utilizing inference compute, showing that scaling the number of samples consistently improves the fraction of problems solved by any attempt, namely the coverage. In this work, we suggest that inference scaling gains should be compared with proper baselines, as some datasets become degenerate when allowing a large number of attempts. We focus on two domains, mathematical reasoning and factual knowledge, showing that for the MATH and Entity Questions datasets, informed answer enumeration obtains similar or even better results than repeated model sampling, with a much lower sample budget. While we believe that inference scaling is a promising approach for unlocking the potential of language models, we recommend carefully selecting models and datasets when applying this method. Otherwise, the results of inference scaling should be interpreted with caution. View details
    Enhancing Remote Sensing Representations through Mixed-Modality Masked Autoencoding
    Ori Linial
    George Leifman
    Yochai Blau
    Nadav Sherman
    Yotam Gigi
    Wojciech Sirko
    Proceedings of the Winter Conference on Applications of Computer Vision (WACV) Workshops (2025), pp. 507-516
    Preview abstract This paper presents an innovative approach to pre-training models for remote sensing by integrating optical and radar data from Sentinel-2 and Sentinel-1 satellites. Using a novel variation on the masked autoencoder (MAE) framework, our model incorporates a dual-task setup: reconstructing masked Sentinel-2 images and predicting corresponding Sentinel-1 images. This multi-task design enables the encoder to capture both spectral and structural features across diverse environmental conditions. Additionally, we introduce a "mixing" strategy in the pretraining phase, combining patches from both image sources, which mitigates spatial misalignment errors and enhances model robustness. Evaluation on segmentation and classification tasks, including Sen1Floods11 and BigEarthNet, demonstrates significant improvements in adaptability and generalizability across varied downstream remote sensing applications. Our findings highlight the advantages of leveraging complementary modalities for more resilient and versatile land cover analysis. View details
    Society-Centric Product Innovation in the Era of Customer Obsession
    International Journal of Science and Research Archive (IJSRA), Volume 14 - Issue 1 (2025)
    Preview abstract This article provides a comprehensive analysis of the evolving landscape of innovation in the technology sector, with a focus on the intersection of technological progress and social responsibility. The article explores key challenges facing the industry, including public trust erosion, digital privacy concerns, and the impact of automation on workforce dynamics. It investigates responsible innovation frameworks' emergence and implementation across various organizations, highlighting the transformation from traditional development approaches to more society-centric models. The article demonstrates how companies balance innovation speed with social responsibility, incorporate ethical considerations into their development processes, and address digital disparities across different demographics. By examining how companies balance the pace of innovation with ethical responsibilities, integrate social considerations into their processes, and address digital inequities across diverse demographics, the article underscores the transformative potential of these frameworks. Through insights into cross-functional teams, impact assessment tools, and stakeholder engagement strategies, it demonstrates how responsible innovation drives both sustainable business value and societal progress. View details
    Shadow Hamiltonian Simulation
    Rolando Somma
    Robbie King
    Tom O'Brien
    Nature Communications, 16 (2025), pp. 2690
    Preview abstract Simulating quantum dynamics is one of the most important applications of quantum computers. Traditional approaches for quantum simulation involve preparing the full evolved state of the system and then measuring some physical quantity. Here, we present a different and novel approach to quantum simulation that uses a compressed quantum state that we call the "shadow state". The amplitudes of this shadow state are proportional to the time-dependent expectations of a specific set of operators of interest, and it evolves according to its own Schrödinger equation. This evolution can be simulated on a quantum computer efficiently under broad conditions. Applications of this approach to quantum simulation problems include simulating the dynamics of exponentially large systems of free fermions or free bosons, the latter example recovering a recent algorithm for simulating exponentially many classical harmonic oscillators. These simulations are hard for classical methods and also for traditional quantum approaches, as preparing the full states would require exponential resources. Shadow Hamiltonian simulation can also be extended to simulate expectations of more complex operators such as two-time correlators or Green's functions, and to study the evolution of operators themselves in the Heisenberg picture. View details
    A Recipe for Improving Remote Sensing Zero Shot Generalization
    Aviad Barzilai
    Yotam Gigi
    Vered Silverman
    Yehonathan Refael
    Bolous Jaber
    Amr Helmy
    3rd ML4RS Workshop at ICLR 2025
    Preview abstract Foundation models have had a significant impact across various AI applications, enabling use cases that were previously impossible. Visual language models (VLMs), in particular, have outperformed other techniques in many tasks. In remote sensing (RS), foundation models have shown improvements across various applications. However, unlike in other fields, the use of VLMs with large-scale remote sensing image-text datasets remains limited. In this work, we first introduce two novel image-caption datasets for training remote sensing foundation models. The first dataset pairs aerial and satellite imagery, aligned with Google-Maps data, with high-quality captions generated using Gemini. The second utilizes public web images and their corresponding alt-text, filtered to the remote sensing domain only, resulting in a highly diverse dataset. We show that using these datasets to pre-train Mammut, a VLM architecture, results in state-of-the-art generalization performance in zero-shot classification and cross-modal retrieval on well-known public benchmarks. Secondly, we leverage this newly pre-trained VLM to generate inference attention maps for a novel class query (i.e., a class unseen during training). We subsequently propose an iterative self-supervised fine-tuning approach where samples aligned with these attention maps are iteratively pseudo-labeled and utilized for model training. View details
    Development and Evaluation of ML Models for Cardiotocography Interpretation
    Nicole Chiou
    Nichole Young-Lin
    Abdoulaye Diack
    Christopher Kelly
    Sanmi Koyejo
    NPJ Women's Health (2025)
    Preview abstract The inherent variability in the visual interpretation of cardiotocograms (CTGs) by obstetric clinical experts, both intra- and inter-observer, presents a substantial challenge in obstetric care. In response, we investigate automated CTG interpretation as a potential solution to enhance the early detection of fetal hypoxia during labor, thereby reducing unnecessary operative interventions and improving overall maternal and neonatal care. This study employs deep learning techniques to reduce the subjectivity associated with visual CTG interpretation. Our results demonstrate that employing objective cord blood pH measurements, rather than clinician-defined Apgar scores, yields more consistent and robust model performance. Additionally, through a series of ablation studies, we investigate the impact of temporal distribution shifts on the performance of these deep learning models. We examine tradeoffs between performance and fairness, specifically evaluating performance across demographic and clinical subgroups. Finally, we discuss the practical implications of our findings for the real-world deployment of such systems, emphasizing their potential utility in medical settings with limited resources. View details
    Security Signals: Making Web Security Posture Measurable At Scale
    David Dworken
    Artur Janc
    Santiago (Sal) Díaz
    Workshop on Measurements, Attacks, and Defenses for the Web (MADWeb)
    Preview abstract The area of security measurability is gaining increased attention, with a wide range of organizations calling for the development of scalable approaches for assessing the security of software systems and infrastructure. In this paper, we present our experience developing Security Signals, a comprehensive system providing security measurability for web services, deployed in a complex application ecosystem of thousands of web services handling traffic from billions of users. The system collects security-relevant information from production HTTP traffic at the reverse proxy layer, utilizing novel concepts such as synthetic signals augmented with additional risk information to provide a holistic view of the security posture of individual services and the broader application ecosystem. This approach to measurability has enabled large-scale security improvements to our services, including prioritized rollouts of security enhancements and the implementation of automated regression monitoring. Furthermore, it has proven valuable for security research and prioritization of defensive work. Security Signals addresses shortcomings of prior web measurability proposals by tracking a comprehensive set of security properties relevant to web applications, and by extracting insights from collected data for use by both security experts and non-experts. We believe the lessons learned from the implementation and use of Security Signals offer valuable insights for practitioners responsible for web service security, potentially inspiring new approaches to web security measurability. View details
    AfriMed-QA: A Pan-African Multi-Specialty Medical Question-Answering Benchmark Dataset
    Tobi Olatunji
    Abraham Toluwase Owodunni
    Charles Nimo
    Jennifer Orisakwe
    Henok Biadglign Ademtew
    Chris Fourie
    Foutse Yuehgoh
    Stephen Moore
    Mardhiyah Sanni
    Emmanuel Ayodele
    Timothy Faniran
    Bonaventure F. P. Dossou
    Fola Omofoye
    Wendy Kinara
    Tassallah Abdullahi
    Michael Best
    2025
    Preview abstract Recent advancements in large language model (LLM) performance on medical multiple-choice question (MCQ) benchmarks have stimulated significant interest from patients and healthcare providers globally. Particularly in low- and middle-income countries (LMICs) facing acute physician shortages and lack of specialists, LLMs offer a potentially scalable pathway to enhance healthcare access and reduce costs. However, LLM training data is sourced from predominantly Western text, and existing benchmarks are predominantly Western-centric, limited to MCQs, and focused on a narrow range of clinical specialties, raising concerns about their applicability in the Global South, particularly across Africa where localized medical knowledge and linguistic diversity are often underrepresented. In this work, we introduce AfriMed-QA, the first large-scale multi-specialty Pan-African medical Question-Answer (QA) dataset designed to evaluate and develop equitable and effective LLMs for African healthcare. It contains 3,000 multiple-choice professional medical exam questions with answers and rationale, 1,500 short answer questions (SAQ) with long-form answers, and 5,500 consumer queries, sourced from over 60 medical schools across 15 countries, covering 32 medical specialties. We further rigorously evaluate multiple open, closed, general, and biomedical LLMs across multiple axes including accuracy, consistency, factuality, bias, potential for harm, local geographic relevance, medical reasoning, and recall. We believe this dataset provides a valuable resource for practical application of large language models in African healthcare and enhances the geographical diversity of health-LLM benchmark datasets. View details