Catalyzing scientific impact through global partnerships and open resources

May 1, 2026

The Google Research Science team

Our approach to open science is built on principles of responsible, inclusive, and rigorous research, empowering a global community to drive high-impact discoveries across disciplines and accelerate progress for all.

A scientific breakthrough reaches its full potential only when it empowers others to replicate and expand upon findings, pushing the boundaries of science even further. At Google Research, we recognize that open-source software and open-access datasets are drivers of modern science. We believe that creating these resources responsibly and maintaining them through partnerships with the global scientific community embodies the spirit of collaboration. In this way, we uphold the principles of open science, ensuring that innovation is not a siloed event but a catalyst for worldwide progress.

Whether it’s the Transformer architecture that reshaped automated language processing, or our specialized models transforming medicine, genomics, neuroscience, climate, energy, and a host of other efforts across the physical, life, and social sciences, we are proud of the work we’ve shared and how it’s being used by researchers around the globe to unlock their own groundbreaking discoveries. This open approach complements our breadth of initiatives across Google to engage and strengthen the research and science ecosystem, including through APIs, publications, conferences, trusted tester programs and private partnerships.

Partnerships and ecosystem collaboration

We collaborate with numerous specialized organizations across scientific disciplines and global regions, such as the University of California Santa Cruz (UCSC) Genomics Institute, Janelia Research Campus, Institute of Science & Technology Austria (ISTA), the Centre for Population Genomics, CSIRO (Australia’s national science agency), and the All India Institute of Medical Sciences (AIIMS).

Beyond individual organizations, we actively support large scientific consortia that take on monumental global challenges, including the Human Pangenome Research Consortium, the Earth BioGenome Project and the NIH BRAIN Initiative.

Ultimately, our open-science philosophy extends to the broader ecosystem, and we are investing in building communities of practice for individual scientific developers, starting in India, Korea, Japan and Australia.

Our open-source tools and data

Over the last decade, we have developed, released, maintained and evolved several key open-source technologies and open-access datasets. To date, these have empowered an active ecosystem of more than 250,000 researchers and developers worldwide.

  • Genomics: Our suite of deep learning tools, including DeepVariant, DeepConsensus and DeepPolisher, improves DNA analysis from raw sequencing to final assemblies. These methods have collectively enabled the global community to process the exomes and whole genomes of 2.5 million individuals.

  • Neuroscience: Our methods and tools for automated reconstruction, analysis, and visualization of connectomic data include flood-filling networks, Neuroglancer, and TensorStore. These technologies allow scientists to seamlessly segment, navigate, and analyze petascale, high-resolution brain tissue reconstructions. Our releases in this area include two key publicly available datasets: H01, a 1.4 petabyte sample of human brain tissue accessed over 200k times, and MICrONS, the largest wiring diagram and functional map of the mouse visual cortex.

  • Earth & Atmospheric Modeling: We have released Open Buildings, which contains 1.8 billion building detections across an inference area of 58M km² covering Africa, South Asia, South-East Asia, Latin America and the Caribbean; Caravan, a community-driven dataset for large-sample hydrology, released as part of our flood forecasting effort, which now provides predictions of the most significant floods in 150 countries covering 2B people; the Groundsource dataset for urban flash floods, comprising 2.6 million historical flood events derived using Gemini from 20 years of public data spanning more than 150 countries; and NeuralGCM, a fully differentiable hybrid atmospheric model. These are also part of our geospatial efforts within Google Earth AI. We have also released FireBench, a high-resolution synthetic dataset designed to advance wildfire research, and a dataset of ionosphere conditions measured using phones, along with a paired visualization of the dataset over time.

  • Biodiversity: SpeciesNet is a global-scale model that classifies 2,498 animal categories, including mammals, birds, and reptiles, in wildlife camera images.

  • Healthcare: Our Health AI Developer Foundations (HAI-DEF) provides a suite of open-weight foundation models — including MedGemma — specialized for multimodal medical text, clinical reasoning, and imaging comprehension. It has more than 4.8M downloads to date. Open Health Stack (OHS) is a suite of open-source tools that make it faster and easier for developers to build secure, offline-capable next-generation digital health solutions based on modern digital healthcare standards. Healthcare applications powered by OHS have been deployed in more than 10 countries with over 65 million beneficiaries.

An image from the human brain fragment reconstruction in which a single neuron (white) receives signals that determine whether or not the neuron fires. This image shows all of the axons that can tell it to fire (green) and all of those that can tell it not to (blue). Credit: Google Research & Lichtman Lab (Harvard University). Renderings by D. Berger (Harvard University)

Real-world impact powered by open science

The true measure of our open-science philosophy is the real-world impact achieved by our partners and end users. Below are some examples detailing how our open tools and datasets have enabled further breakthroughs and been used to help communities across the globe.

Enabling global science

African nonprofit Sunbird AI uses Google’s Open Buildings dataset to better understand the energy needs of communities in urban and rural areas.

Enabling health advances

  • HAI-DEF has driven widespread global engagement and tangible clinical impact by providing open-weight models that democratize medical AI development, especially in low- and middle-income countries. For instance, Zambia-based Dawa Health used MedSigLIP to build an AI-powered multilingual cervical cancer education and screening tool that allows midwives to upload colposcopy images via WhatsApp to identify abnormalities in real time.

  • Open Health Stack has enabled developers globally to address healthcare gaps, particularly in low-resource settings. For example, Ona builds apps that allow health workers to switch from paper-based records to digital solutions. OHS accelerated Ona’s app development and allowed them to adopt interoperable data standards, which healthcare workers then used to deliver better care in underserved communities.

  • In New Delhi, AIIMS is using MedGemma to develop applications for outpatient triage and dermatology screening. In Malaysia, MedGemma powers Ask CPG, a conversational interface to the country’s 150+ clinical practice guidelines; the Ministry of Health in Malaysia said it has made the guidelines easier to navigate for day-to-day decision support. MedGemma is also empowering individual developers worldwide to build applications for clinical triage, medical document understanding, and diagnostic decision support.

Enabling biodiversity and conservation

  • Since 2010, the Snapshot Serengeti camera trapping program has captured over 11 million wildlife images from the African savanna. Using SpeciesNet, researchers at Wake Forest University can now analyze this large dataset in just days, and by running the model from a laptop, they can use the latest wildlife sightings to redeploy cameras in real time to collect targeted data.

  • Researchers at the University of Otago are working to preserve the critically endangered kākāpō, a flightless bird of significant cultural importance. Working independently of Google, the researchers re-trained DeepVariant to optimize it for the kākāpō population. This model enabled them to create a genetic map of every living kākāpō to inform breeding strategies and care plans for sick birds, helping to expand the population from a low of 51 to 252 birds.

  • Researchers at CSIRO are working with Google to support repopulation efforts for endangered Australian and Tasmanian giant kelp populations. By using Google Earth models and satellite imagery to identify surviving patches, and Google’s open genomics tools to create reference genomes, researchers are linking genetic variants to heat tolerance data. This allows researchers to selectively breed kelp strains that are resilient to rising ocean temperatures.

Images of the elephant, zebra and secretary bird were captured by the Snapshot Serengeti program in Tanzania’s Serengeti National Park. Credit: Snapshot Serengeti / T.M. Anderson. The image of the ocelot was captured in Colombia by Project Lucitania at the Universidad de los Andes. Credit: Project Lucitania/Universidad de los Andes/Red Otus. The image of the mule deer was captured by the Idaho Department of Fish and Game (IDFG). Credit: IDFG. SpeciesNet can help identify these animals.

Looking ahead

Our partnership with the open science community is an accelerating mission. As we transition deeper into the era of AI-enabled science, we are inspired by the way generative AI is profoundly changing how researchers work and collaborate. We believe that agentic workflows will allow scientists to encode their knowledge into specialized skills and transform their methods into accessible, scalable tools. This shift will empower the global community to rapidly reproduce findings, extend complex methodologies, and share their work globally.

In this fast-paced new paradigm, communication and collaboration are more critical than ever. Open-source software and open datasets serve as the essential foundation for this ecosystem. The breakthroughs we celebrate today are merely the initial blueprints for a world with faster innovation and universal sharing of scientific knowledge.

At Google Research, we will continue to build the tools and infrastructure that support this new era of discovery. We look forward to seeing what the global scientific community achieves next.

Acknowledgments

We give special thanks to our many global research partners and to the wider scientific community of users that builds upon our open models, infrastructure, datasets, and other tools to make discoveries and to pioneer, pilot, and implement innovations that create positive global societal impact.
