GoalsEye: Learning High Speed Precision Table Tennis on a Physical Robot

Tianli Ding; Laura Graesser; Saminda Wishwajith Abeyruwan; David B. D'Ambrosio; Anish Shankar; Pierre Sermanet; Pannag Raghunath Sanketi; Corey Harrison Lynch

International Conference on Intelligent Robots and Systems (IROS) (2022)

Abstract

Learning goal-conditioned control in the real world is a challenging open problem in robotics. Reinforcement learning systems have the potential to learn autonomously via trial and error, but in practice the costs of manual reward design, ensuring safe exploration, and hyperparameter tuning are often enough to preclude real-world deployment. Imitation learning approaches, on the other hand, offer a simple way to learn control in the real world, but typically require costly curated demonstration data and lack a mechanism for continuous improvement. Recently, iterative imitation techniques have been shown to learn goal-directed control from undirected demonstration data and to improve continuously via self-supervised goal reaching, but results thus far have been limited to simulated environments. In this work, we present evidence that iterative imitation learning can scale to goal-directed behavior on a real robot in a dynamic setting: high-speed, precision table tennis (e.g. "land the ball on this particular target"). We find that this approach offers a straightforward way to do continuous on-robot learning, without complexities such as reward design or sim-to-real transfer. It is also scalable: it is sample-efficient enough to train on a physical robot in just a few hours. In real-world evaluations, we find that the resulting policy can perform on par with or better than amateur humans (with players sampled randomly from a robotics lab) at the task of returning the ball to specific targets on the table. Finally, we analyze the effect of the size of an initial undirected bootstrap dataset on performance, finding that a modest amount of unstructured demonstration data provided up front drastically speeds up the convergence of a general-purpose goal-reaching policy.
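The iterative imitation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-dimensional "table", the linear policy, the noise levels, and every name below (`landing_position`, `fit_policy`, `train`) are assumptions chosen only to show the general pattern of bootstrapping from undirected data, attempting goals, relabeling each attempt with the outcome actually achieved, and refitting by supervised learning.

```python
import random


def landing_position(action, rng):
    """Toy stand-in for robot and ball physics (hypothetical, unknown to the learner)."""
    return 2.0 * action + 0.5 + rng.gauss(0.0, 0.05)


def fit_policy(data):
    """Least-squares fit of action = w * goal + b on (goal, action) pairs."""
    n = len(data)
    mean_g = sum(g for g, _ in data) / n
    mean_a = sum(a for _, a in data) / n
    var_g = sum((g - mean_g) ** 2 for g, _ in data)
    cov = sum((g - mean_g) * (a - mean_a) for g, a in data)
    w = cov / var_g if var_g > 0 else 0.0
    return w, mean_a - w * mean_g


def mean_goal_error(policy, rng, trials=200):
    """Average distance between commanded goals and achieved landings."""
    w, b = policy
    total = 0.0
    for _ in range(trials):
        goal = rng.uniform(-1.0, 2.0)
        total += abs(landing_position(w * goal + b, rng) - goal)
    return total / trials


def train(bootstrap_size=20, iterations=5, episodes_per_iter=50, seed=0):
    rng = random.Random(seed)
    data = []
    # Undirected bootstrap data: random actions, each labeled in hindsight
    # with the landing position it happened to produce.
    for _ in range(bootstrap_size):
        action = rng.uniform(-1.0, 1.0)
        data.append((landing_position(action, rng), action))
    policy = fit_policy(data)
    errors = [mean_goal_error(policy, rng)]
    for _ in range(iterations):
        for _ in range(episodes_per_iter):
            goal = rng.uniform(-1.0, 2.0)  # self-proposed target
            w, b = policy
            action = w * goal + b + rng.gauss(0.0, 0.1)  # act with exploration noise
            achieved = landing_position(action, rng)
            # Hindsight relabeling: store the achieved outcome as the goal,
            # so every attempt is a valid demonstration for some goal.
            data.append((achieved, action))
        policy = fit_policy(data)  # plain supervised refit, no reward function
        errors.append(mean_goal_error(policy, rng))
    return policy, errors


if __name__ == "__main__":
    policy, errors = train()
    print("policy (w, b):", policy)
    print("mean goal error per iteration:", errors)
```

In this toy setting the relation between action and outcome is linear, so even a small bootstrap set gives a usable policy. The sketch is meant to show the data flow only, and says nothing about the sample efficiency reported for the physical robot.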
