Natural Language Processing

Natural Language Processing (NLP) research at Google focuses on algorithms that apply at scale, across languages, and across domains. Our systems are used throughout Google, shaping the user experience in Search, mobile, apps, Ads, Translate, and more.

Our work spans the range of traditional NLP tasks, with general-purpose syntactic and semantic algorithms underpinning more specialized systems. We are particularly interested in algorithms that scale well and run efficiently in a highly distributed environment.

Our syntactic systems predict a part-of-speech tag for each word in a given sentence, as well as morphological features such as gender and number. They also label relationships between words, such as subject, object, and modification. We focus on efficient algorithms that leverage large amounts of unlabeled data, and have recently incorporated neural networks.
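As a rough illustration of the kind of output described above, the sketch below represents one hand-annotated sentence with a part-of-speech tag, morphological features, and a labeled link to a syntactic head for each word. The `Token` class, the `relations` helper, and the tag and label names are assumptions chosen for this example; they are not the representation used by any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    pos: str                                   # part-of-speech tag
    morph: dict = field(default_factory=dict)  # morphological features
    head: int = -1                             # index of the syntactic head; -1 marks the root
    relation: str = "root"                     # label of the relation to the head


# Hand-written annotation for illustration only; not the output of a real parser.
sentence = [
    Token("She", "PRON", {"Number": "Sing", "Gender": "Fem"}, head=1, relation="nsubj"),
    Token("reads", "VERB", {"Number": "Sing"}, head=-1, relation="root"),
    Token("books", "NOUN", {"Number": "Plur"}, head=1, relation="obj"),
]


def relations(tokens):
    """Return (head word, relation, dependent word) triples, skipping the root."""
    return [(tokens[t.head].text, t.relation, t.text) for t in tokens if t.head >= 0]


print(relations(sentence))
# [('reads', 'nsubj', 'She'), ('reads', 'obj', 'books')]
```

Storing the head as an index into the same list keeps the structure flat, so subject and object relations can be read off in a single pass over the tokens.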

On the semantic side, we identify entities in free text, label them with types (such as person, location, or organization), cluster mentions of those entities within and across documents (coreference resolution), and resolve the entities to the Knowledge Graph.
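To make these steps concrete, here is a minimal sketch of what the result of entity typing, coreference clustering, and resolution might look like once it has been computed. The `Mention` class, the `cluster_by_entity` helper, and the `kb:` identifiers are hypothetical placeholders for this example; they do not reflect real Knowledge Graph identifiers or any production data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    doc_id: str
    text: str
    entity_type: str  # e.g. "person", "location", "organization"
    entity_id: str    # hypothetical knowledge-base identifier


# Hand-written annotations for illustration only; not the output of a real system.
mentions = [
    Mention("doc1", "Ada Lovelace", "person", "kb:0001"),
    Mention("doc1", "she", "person", "kb:0001"),
    Mention("doc2", "Lovelace", "person", "kb:0001"),
    Mention("doc2", "London", "location", "kb:0002"),
]


def cluster_by_entity(ms):
    """Group mentions that resolve to the same entity, within and across documents."""
    clusters = defaultdict(list)
    for m in ms:
        clusters[m.entity_id].append((m.doc_id, m.text))
    return dict(clusters)


clusters = cluster_by_entity(mentions)
print(clusters["kb:0001"])
# [('doc1', 'Ada Lovelace'), ('doc1', 'she'), ('doc2', 'Lovelace')]
```

Grouping by the resolved identifier only shows the shape of the final clusters. Deciding which mentions belong together is the hard part of the task, and this sketch does not attempt it.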

Recent work has focused on incorporating multiple sources of knowledge and information to aid the analysis of text, as well as applying frame semantics at the noun phrase, sentence, and document level.

Recent Publications

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal

arXiv (2024)

Investigating Content Planning for Navigating Trade-offs in Knowledge-Grounded Dialogue

Kushal Chawla, Hannah Rashkin, Gaurav Singh Tomar, David Reitter

EACL (2024) (to appear)

Now You See Me, Now You Don't: 'Poverty of the Stimulus' Problems and Arbitrary Correspondences in End-to-End Speech Models

Daan van Esch

Proceedings of the Second Workshop on Computation and Written Language (CAWL) 2024

How Does Beam Search improve Span-Level Confidence Estimation in Generative Sequential Labeling?

Kazuma Hashimoto, Iftekhar Naim, Karthik Raman

EACL 2024 workshop on UncertaiNLP

Demystifying Embedding Spaces using Large Language Models

Guy Tennenholtz, Yinlam Chow, Chih-wei Hsu, Jihwan Jeong, Lior Shani, Aza Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier

The Twelfth International Conference on Learning Representations (2024)

Learning to Rewrite Prompts for Personalized Text Generation

Cheng Li, Mingyang Zhang, Qiaozhu Mei, Weize Kong, Michael Bendersky

Proceedings of the ACM Web Conference 2024
