We advance the state of the art in natural language technologies and build systems that learn to understand and use language in context.
About the team
Our team comprises multiple research groups working on a range of language projects. We collaborate closely with teams across Google, applying efficient algorithms, neural networks, and graphical and probabilistic models to help guide product development and direction. In doing so, the Language team enables natural and assistive communication with users, finds answers to user questions, analyzes app store reviews for developers, and more.
Our researchers are experts in traditional natural language processing and machine learning, combining methodological research with applied science. All of our Language engineers are equally involved in long-term research and in driving immediate applications of our technology. Our systems also benefit greatly from Google's linguists, who provide valuable labeled data and help enable internationalization.
Recent research interests of the Language team include syntax, discourse, conversation, multilingual modeling, sentiment analysis, question answering, summarization, and generally building better learners using labeled and unlabeled data, state-of-the-art modeling, and indirect supervision.
To help spur development in open-domain question answering, we have created the Natural Questions (NQ) corpus, along with a challenge website based on this data.
We released TensorFlow code and models for BERT, a novel pre-training technique that achieves state-of-the-art results on 11 natural language processing tasks.
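A core ingredient of BERT is its masked language modeling objective: tokens are hidden at random and the model learns to predict them from both left and right context. The sketch below is plain illustrative Python, not the released TensorFlow code; the real recipe also sometimes keeps or randomizes a masked position instead of always substituting [MASK].

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Toy sketch of creating BERT-style masked-LM training inputs.

    Each token is independently replaced with [MASK] with probability
    mask_prob; the original token becomes the prediction target there.
    (Simplification: real BERT masks 80% / randomizes 10% / keeps 10%.)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(tok)   # the model must recover this token
        else:
            masked.append(tok)
            labels.append(None)  # no prediction loss at this position
    return masked, labels
```

During pre-training, the loss is computed only at the positions with a non-`None` label, which is what lets the model learn from unlabeled text.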
Active Question Answering (ActiveQA) is a TensorFlow package that investigates using reinforcement learning to train artificial agents for question answering.
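The idea behind ActiveQA is that an agent learns, by reinforcement, to reformulate questions so that a fixed question-answering system returns better answers, with answer quality serving as the reward. A minimal sketch of that reward signal follows; all names here are illustrative, not the actual ActiveQA package API, and the greedy selection stands in for what the learned policy is trained to optimize.

```python
def f1_score(prediction, reference):
    """Simplified token-overlap F1, a common QA reward signal.

    (Simplification: uses set overlap rather than multiset counts.)
    """
    pred, ref = prediction.split(), reference.split()
    common = set(pred) & set(ref)
    if not common:
        return 0.0
    precision = len(common) / len(pred)
    recall = len(common) / len(ref)
    return 2 * precision * recall / (precision + recall)

def best_reformulation(reformulations, qa_system, reference):
    """Score each candidate rewrite by the reward its answer earns,
    and return the highest-scoring rewrite."""
    scored = [(f1_score(qa_system(q), reference), q) for q in reformulations]
    return max(scored)[1]
```

In the RL setting, these per-rewrite rewards would update the reformulation policy rather than just select a winner.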
We released a new dataset consisting of ~3.3 million image/caption pairs and an image captioning challenge for the ML community to train and evaluate their own models on the Conceptual Captions test bed.
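To our knowledge, the released Conceptual Captions data is distributed as tab-separated caption/image-URL pairs; the helper below is a hedged parsing sketch under that assumption (the function name and dictionary keys are our own, not part of the release).

```python
import csv
import io

def parse_caption_pairs(tsv_text):
    """Parse Conceptual-Captions-style rows.

    Assumes (hedged) one 'caption<TAB>image_url' pair per line,
    as in the commonly distributed TSV format.
    """
    pairs = []
    for caption, url in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        pairs.append({"caption": caption.strip(), "image_url": url.strip()})
    return pairs
```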
Based on our examination of the use of Smart Reply in Inbox and our ideas about how humans learn and use language, we have created a new version of Smart Reply for Gmail.
Representation learning
Learn language representations that capture meaning at various levels of granularity, shared and reusable across domains.
Machine translation
Use state-of-the-art machine learning techniques and large-scale infrastructure to break language barriers and deliver human-quality translation across many languages, making it easy to explore a multilingual world.
Question answering
Learn end-to-end models for real-world question answering that require complex reasoning about concepts, entities, relations, and causality in the world.
Document understanding
Learn document representations from geometric features and spatial relations, multi-modal content features, and syntactic, semantic, and pragmatic signals.
Dialogue systems
Advance next-generation dialogue systems for human-machine and multi-human-machine interaction, achieving natural user interactions and enriching conversations between human users.
Summarization
Learn to summarize single and multiple documents into cohesive and concise summaries that accurately represent the documents.
Writing / generation
Produce natural and fluent output for spoken and written text for different domains and styles.
Sensitive content detection
Learn end-to-end models of offensive, inappropriate and controversial content in text.
i18n for query and conversation understanding
Learn to extend models to support new languages easily, and to deal effectively with code-mixed language.
Speech and language algorithms
Represent, combine, and optimize models for speech to text and text to speech.
Language & vision
Understand visual inputs (image & video) and express that understanding using fluent natural language (phrases, sentences, paragraphs).
Some of our people
Most of Google’s users interact with us through language. Working on the Language team means you get to play a critical role in helping our systems understand what users want.
The Language team provides opportunities to work on ambitious research projects and to share successes along the way with products and the academic community.