Google Research

PanGEA: The Panoramic Graph Environment Annotation Toolkit

2nd Workshop on Advances in Language and Vision Research (ALVR) (2021)

Abstract

PanGEA, the Panoramic Graph Environment Annotation toolkit, is a lightweight toolkit for collecting speech and text annotations in photo-realistic 3D environments. PanGEA immerses annotators in a web-based simulation and allows them to move around easily as they speak and/or listen. It includes database and cloud storage integration, plus utilities for automatically aligning recorded speech with manual transcriptions and the virtual pose of the annotators. Out of the box, PanGEA supports two tasks -- collecting navigation instructions and navigation instruction following -- and it could be easily adapted for annotating walking tours, finding and labeling landmarks or objects, and similar tasks. We share best practices learned from using PanGEA in a 20,000 hour annotation effort to collect the Room-Across-Room (RxR) dataset. We hope that our open-source annotation toolkit and insights will both expedite future data collection efforts and spur innovation on the kinds of grounded language tasks such environments can support.

Learn more about how we do research

We maintain a portfolio of research projects, providing individuals and teams the freedom to emphasize specific types of work