Pannag R Sanketi
Research Areas
Authored Publications
Sort By
Agile Catching with Whole-Body MPC and Blackbox Policy Learning
Saminda Abeyruwan
Nick Boffi
Anish Shankar
Jean-Jacques Slotine
Stephen Tu
Learning for Dynamics and Control (2023)
Preview abstract
We address a benchmark task in agile robotics: catching objects thrown at high-speed. This is a
challenging task that involves tracking, intercepting, and cradling a thrown object with access only to visual observations of the object and the proprioceptive state of the robot, all within a fraction of a second. We present the relative merits of two fundamentally different solution strategies: (i) Model Predictive Control using accelerated constrained trajectory optimization, and (ii) Reinforcement Learning using zeroth-order optimization. We provide insights into various performance tradeoffs including sample efficiency, sim-to-real transfer, robustness to distribution shifts, and wholebody multimodality via extensive on-hardware experiments. We conclude with proposals on fusing “classical” and “learning-based” techniques for agile robot control. Videos of our experiments may be found here: https://sites.google.com/view/agile-catching.
View details
Robotic Table Tennis: A Case Study into a High Speed Learning System
Jon Abelian
Saminda Abeyruwan
Michael Ahn
Justin Boyd
Erwin Johan Coumans
Omar Escareno
Wenbo Gao
Navdeep Jaitly
Juhana Kangaspunta
Satoshi Kataoka
Gus Kouretas
Yuheng Kuang
Corey Lynch
Thinh Nguyen
Ken Oslund
Barney J. Reed
Anish Shankar
Avi Singh
Grace Vesom
Peng Xu
Robotics: Science and Systems (2023)
Preview abstract
We present a deep-dive into a learning robotic system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and has the ability to precisely return the ball to desired targets. This system puts together a highly optimized and novel perception subsystem, a high-speed low-latency robot controller, a simulation paradigm that can prevent damage in the real world and also train policies for zero-shot transfer, and automated real world environment resets that enable autonomous training and evaluation on physical robots. We complement a complete system description including numerous design decisions that are typically not widely disseminated, with a collection of ablation studies that clarify the importance of mitigating various sources of latency, accounting for training and deployment distribution shifts, robustness of the perception system, and sensitivity to policy hyper-parameters and choice of action space. A video demonstrating the components of our system and details of experimental results is included in the supplementary material.
View details
i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops
Saminda Wishwajith Abeyruwan
Avi Singh
Anish Shankar
Conference on Robot Learning (Oral) (2022)
Preview abstract
Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to train policies in simulation enables safe exploration and large-scale data collection quickly at low cost. However, prior works in sim-to-real transfer of robotic policies typically do not involve any human-robot interaction because accurately simulating human behavior is an open problem. In this work, our goal is to leverage the power of simulation to train robotic policies that are proficient at interacting with humans upon deployment. This presents a chicken-and-egg problem --- how to gather examples of a human interacting with a physical robot so as to model human behavior in simulation without already having a robot that is able to interact with a human? Our proposed method, Iterative-Sim-to-Real i-S2R), attempts to address this. i-S2R bootstraps from a simple model of human behavior and alternates between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are improved. We evaluate our method on a real world robotic table tennis setting, where the objective for the robot is to play cooperatively with a human player for as long as possible. Table tennis is a high-speed, dynamic task that requires the two players to react quickly to each other’s moves, making for a challenging test bed for research on human-robot interaction. We present results on a physical industrial robotic arm that is able to cooperatively play table tennis against human players, achieving rallies of 22 successive hits on average and 150 at best. Further, for 80% of players, rally lengths are 70% to 175% longer compared to the sim-to-real (S2R) baseline.
View details
GoalsEye: Learning High Speed Precision Table Tennis on a Physical Robot
Saminda Wishwajith Abeyruwan
Anish Shankar
Corey Harrison Lynch
International Conference on Intelligent Robots and Systems (IROS) (2022)
Preview abstract
Learning goal conditioned control in the real world is a challenging open problem in robotics. Reinforcement learning systems have the potential to learn autonomously via trial-and-error, but in practice the costs of manual reward design, ensuring safe exploration, and hyperparameter tuning are often enough to preclude real world deployment. Imitation learning approaches, on the other hand, offer a simple way to learn control in the real world, but typically require costly curated demonstration data and lack a mechanism for continuous improvement. Recently, iterative imitation techniques have been shown to learn goal directed control from undirected demonstration data, and improve continuously via self-supervised goal reaching, but results thus far have been limited to simulated environments. In this work, we present evidence that iterative imitation learning can scale to goal-directed behavior on a real robot in a dynamic setting: high speed, precision table tennis (e.g. "land the ball on this particular target"). We find that this approach offers a straightforward way to do continuous on-robot learning, without complexities such as reward design or sim-to-real transfer. It is also scalable -- sample efficient enough to train on a physical robot in just a few hours. In real world evaluations, we find that the resulting policy can perform on par or better than amateur humans (with players sampled randomly from a robotics lab) at the task of returning the ball to specific targets on the table. Finally, we analyze the effect of an initial undirected bootstrap dataset size on performance, finding that a modest amount of unstructured demonstration data provided up-front drastically speeds up the convergence of a general purpose goal-reaching policy.
View details
Robotic Table Tennis with Model-Free Reinforcement Learning
Wenbo Gao
Navdeep Jaitly
International Conference on Intelligent Robots and Systems (IROS) (2020)
Preview abstract
We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100Hz. We demonstrate that evolutionary search (ES) methods acting on CNN-based policy architectures for non-visual inputs and convolving across time learn compact controllers leading to smooth motions. Furthermore, we show that with appropriately tuned curriculum learning on the task and rewards, policies are capable of developing multi-modal styles, specifically forehand and backhand stroke, whilst achieving 80\% return rate on a wide range of ball throws. We observe that multi-modality does not require any architectural priors, such as multi-head architectures or hierarchical policies.
View details