Sumeet Singh
Authored Publications
Agile Catching with Whole-Body MPC and Blackbox Policy Learning
Saminda Abeyruwan
Nick Boffi
Anish Shankar
Jean-Jacques Slotine
Stephen Tu
Learning for Dynamics and Control (2023)
We address a benchmark task in agile robotics: catching objects thrown at high speed. This is a challenging task that involves tracking, intercepting, and cradling a thrown object with access only to visual observations of the object and the proprioceptive state of the robot, all within a fraction of a second. We present the relative merits of two fundamentally different solution strategies: (i) Model Predictive Control using accelerated constrained trajectory optimization, and (ii) Reinforcement Learning using zeroth-order optimization. We provide insights into various performance tradeoffs including sample efficiency, sim-to-real transfer, robustness to distribution shifts, and whole-body multimodality via extensive on-hardware experiments. We conclude with proposals on fusing “classical” and “learning-based” techniques for agile robot control. Videos of our experiments may be found here: https://sites.google.com/view/agile-catching.
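To illustrate what "zeroth-order optimization" means in strategy (ii), here is a minimal sketch of an antithetic evolution-strategies gradient estimate. It is not the authors' implementation: `toy_reward` is a hypothetical stand-in for an episode return, and the names `es_gradient` and `train`, the step size, and the smoothing scale are all assumptions chosen for the example.

```python
import random

TARGET = [1.0, -2.0, 0.5]  # hypothetical optimum of the toy reward


def toy_reward(theta):
    # Hypothetical stand-in for an episode return; no robot or simulator involved.
    return -sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def es_gradient(reward, theta, rng, sigma=0.1, num_pairs=32):
    """Estimate the reward gradient from function values only.

    Each Gaussian perturbation is evaluated in both directions (antithetic
    sampling) and the reward difference weights the perturbation direction.
    """
    grad = [0.0] * len(theta)
    for _ in range(num_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        plus = reward([t + sigma * e for t, e in zip(theta, eps)])
        minus = reward([t - sigma * e for t, e in zip(theta, eps)])
        scale = (plus - minus) / (2.0 * sigma * num_pairs)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad


def train(steps=200, lr=0.05, seed=0):
    """Gradient ascent on the toy reward using only zeroth-order estimates."""
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = es_gradient(toy_reward, theta, rng)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(train())
```

The estimator never differentiates the reward, which is why this family of methods can be applied to blackbox rollouts; the price is many reward evaluations per update.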
Single-Level Differentiable Contact Simulation
Simon Le Cleac'h
Mac Schwager
Zachary Manchester
Pete Florence
IEEE RAL (2023)
We present a differentiable formulation of rigid-body contact dynamics for objects and robots represented as compositions of convex primitives. Existing optimization-based approaches simulating contact between convex primitives rely on a bilevel formulation that separates collision detection and contact simulation. These approaches are unreliable in realistic contact simulation scenarios because isolating the collision detection problem introduces contact location non-uniqueness. Our approach combines contact simulation and collision detection into a unified single-level optimization problem. This disambiguates the collision detection problem in a physics-informed manner. Compared to previous differentiable simulation approaches, our formulation features improved simulation robustness and computational complexity improved by more than an order of magnitude. We provide a numerically efficient implementation of our formulation in the Julia language called DojoLight.jl (https://github.com/simon-lc/DojoLight.jl).
Robotic table wiping via whole-body trajectory optimization and reinforcement learning
Benjie Holson
Jeffrey Bingham
Jonathan Weisz
Mario Prats
Peng Xu
Thomas Lew
Xiaohan Zhang
Yao Lu
ICRA (2022)
We propose an end-to-end framework to enable multipurpose assistive mobile robots to autonomously wipe tables and clean spills and crumbs. This problem is challenging, as it requires planning wiping actions with uncertain latent crumbs and spill dynamics over high-dimensional visual observations, while simultaneously guaranteeing constraint satisfaction to enable deployment in unstructured environments. To tackle this problem, we first propose a stochastic differential equation (SDE) to model crumbs and spill dynamics and absorption with the robot wiper. Then, we formulate a stochastic optimal control problem for planning wiping actions over visual observations, which we solve using reinforcement learning (RL). We then propose a whole-body trajectory optimization formulation to compute joint trajectories to execute wiping actions while guaranteeing constraint satisfaction. We extensively validate our table wiping approach in simulation and on hardware.
Preview abstract
Indirect trajectory optimization methods such as Differential Dynamic Programming (DDP) have found considerable success when only planning under dynamic feasibility constraints. Meanwhile, nonlinear programming (NLP) has been the state-of-the-art approach when faced with additional constraints (e.g., control bounds, obstacle avoidance). However, a naïve implementation of NLP algorithms, e.g., shooting-based sequential quadratic programming (SQP), may suffer from slow convergence, caused by natural instabilities of the underlying system manifesting as poor numerical stability within the optimization. Re-interpreting the DDP closed-loop rollout policy as a sensitivity-based correction to a second-order search direction, we demonstrate how to compute analogous closed-loop policies (i.e., feedback gains) for constrained problems. Our key theoretical result introduces a novel dynamic programming-based constraint-set recursion that augments the canonical "cost-to-go" backward pass. On the algorithmic front, we develop a hybrid-SQP algorithm incorporating DDP-style closed-loop rollouts, enabled via efficient parallelized computation of the feedback gains. Finally, we validate our theoretical and algorithmic contributions on a set of increasingly challenging benchmarks, demonstrating significant improvements in convergence speed over standard open-loop SQP.
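The instability issue described above can be seen on a toy problem. The sketch below is an unconstrained, scalar LQR illustration and not the paper's constrained recursion: the system `x_{k+1} = a*x_k + b*u_k` with `a = 1.5`, the cost weights, and the function names are all assumptions made for the example. It applies the same small control step once in an open-loop rollout and once in a closed-loop rollout that adds feedback on the deviation from the nominal trajectory.

```python
def lqr_gains(a, b, q, r, horizon):
    """Backward Riccati pass for the scalar system x_{k+1} = a*x_k + b*u_k."""
    p = q  # terminal cost-to-go weight
    gains = []
    for _ in range(horizon):
        k = -(b * p * a) / (r + b * p * b)  # feedback gain for this stage
        p = q + a * p * a + a * p * b * k   # cost-to-go weight one stage earlier
        gains.append(k)
    gains.reverse()  # reorder from stage 0 to stage horizon-1
    return gains


def rollout(a, b, x0, controls, nominal_states=None, gains=None):
    """Simulate forward; if gains are given, add feedback on the state deviation."""
    xs = [x0]
    for k, u in enumerate(controls):
        if gains is not None:
            u = u + gains[k] * (xs[-1] - nominal_states[k])
        xs.append(a * xs[-1] + b * u)
    return xs


A, B, HORIZON = 1.5, 1.0, 30            # open-loop unstable toy system (assumed)
nominal_controls = [0.0] * HORIZON
nominal_states = rollout(A, B, 0.0, nominal_controls)  # stays at the origin

# A small step applied to every control, as a search direction might produce.
stepped_controls = [u + 0.01 for u in nominal_controls]

open_loop = rollout(A, B, 0.0, stepped_controls)
closed_loop = rollout(A, B, 0.0, stepped_controls, nominal_states,
                      lqr_gains(A, B, 1.0, 1.0, HORIZON))

if __name__ == "__main__":
    print("open-loop final state:  ", open_loop[-1])
    print("closed-loop final state:", closed_loop[-1])
```

In the open-loop rollout the small step is amplified by the unstable dynamics over the horizon, while the feedback term keeps the closed-loop trajectory near the nominal one.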
Continuous Control and Multiscale Sensor Fusion with Neural CDEs
Francis Edward McCann Ramirez
Jake Varley
Andy Zeng
IROS & RSS Imitation Learning Workshop (2022)
Even though robot learning is often formulated in terms of discrete-time Markov decision processes (MDPs), physical robots require near-continuous multiscale feedback control. Machines operate on multiple asynchronous sensing modalities each with different frequencies, e.g., video frames at 30Hz, proprioceptive state at 100Hz, force-torque data at 500Hz, etc. While the classic approach is to batch observations into fixed-time windows then pass them through feed-forward encoders (e.g., with deep networks), we show that there exists a more elegant approach, one that treats policy learning as modeling latent state dynamics in continuous-time.
Specifically, we present 'InFuser', a unified architecture that trains continuous-time policies with Neural Controlled Differential Equations (CDEs). 'InFuser' evolves a single latent state representation over time by (In)tegrating and (Fus)ing multi-sensory observations (arriving at different frequencies), and inferring actions in continuous-time. This enables policies that can react to multi-frequency multi-sensory feedback for truly end-to-end visuomotor control, without discrete-time assumptions. Behavior cloning experiments demonstrate that 'InFuser' learns robust policies for dynamic tasks (e.g., swinging a ball into a cup), notably outperforming several baselines in settings where observations from one sensing modality can arrive at much sparser intervals than others.
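For readers unfamiliar with Neural CDEs, the standard formulation evolves a latent state driven by a continuous path X(t) built by interpolating the incoming observations. The action readout in the second expression is an assumption added for illustration (the symbols π_φ and a(t) do not appear in the abstract), and the exact InFuser architecture may differ.

```latex
z(t) = z(t_0) + \int_{t_0}^{t} f_\theta\big(z(s)\big)\,\mathrm{d}X(s),
\qquad
a(t) = \pi_\phi\big(z(t)\big)
```

Because the integral is taken against the observation path itself, a new sample from any sensor changes X(t) and therefore the latent state, regardless of the rate at which that sensor reports.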
Trajectory Optimization with Optimization-Based Dynamics
Taylor Howell
Simon Le Cleac'h
Pete Florence
Zachary Manchester
ICRA (2022)
We present a framework for bi-level trajectory optimization in which a system's dynamics are encoded as the solution to a constrained optimization problem and smooth gradients of this lower-level problem are passed to an upper-level trajectory optimizer. This optimization-based dynamics representation enables constraint handling, additional variables, and non-smooth behavior to be abstracted away from the upper-level optimizer, and allows classical unconstrained optimizers to synthesize trajectories for more complex systems. We provide a path-following method for efficient evaluation of constrained dynamics and utilize the implicit-function theorem to compute smooth gradients of this representation. We demonstrate the framework by modeling systems from the locomotion, aerospace, and manipulation domains with optimization-based dynamics, including an acrobot with joint limits, a cart-pole subject to Coulomb friction, a Raibert hopper, a rocket landing with thrust limits, and a planar-push task, and then optimizing trajectories using iterative LQR.
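The implicit-function-theorem step can be sketched on a one-dimensional lower-level problem. This is a toy illustration under stated assumptions, not the paper's solver: the lower-level problem `min_z 0.5*(z - theta)**2 - kappa*log(z)` (a log-barrier relaxation of the constraint z >= 0 with a fixed barrier parameter `kappa`) and the function names are hypothetical choices for the example.

```python
import math

KAPPA = 1e-2  # fixed barrier parameter (assumed for the example)


def residual(z, theta, kappa=KAPPA):
    """Optimality condition of min_z 0.5*(z - theta)^2 - kappa*log(z)."""
    return z - theta - kappa / z


def solve_lower_level(theta, kappa=KAPPA, iters=50):
    """Newton's method on the optimality condition, keeping z strictly positive."""
    z = max(theta, 1.0)  # strictly positive initial guess
    for _ in range(iters):
        step = residual(z, theta, kappa) / (1.0 + kappa / (z * z))
        while z - step <= 0.0:  # shorten the step to stay inside the barrier
            step *= 0.5
        z -= step
    return z


def implicit_gradient(z, kappa=KAPPA):
    """dz/dtheta = -(dr/dz)^{-1} * (dr/dtheta), evaluated at the solution z."""
    dr_dz = 1.0 + kappa / (z * z)
    dr_dtheta = -1.0
    return -dr_dtheta / dr_dz


if __name__ == "__main__":
    for theta in (-0.5, 0.3, 2.0):
        z = solve_lower_level(theta)
        print(theta, z, implicit_gradient(z))
```

The gradient comes from differentiating the optimality condition at the solution rather than differentiating through the Newton iterations, and the barrier term makes it vary smoothly as `theta` crosses the constraint boundary.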
Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation
Anthony G. Francis
Dmitry Kalashnikov
Edward Lee
Jake Varley
Leila Takayama
Mikael Persson
Peng Xu
Stephen Tu
Xuesu Xiao
Conference on Robot Learning (2022) (to appear)
Despite decades of research, existing navigation systems still face real-world challenges when being deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces. To address this, we present a new class of implicit control policies combining the benefits of imitation learning with the robust handling of system constraints of Model Predictive Control (MPC). Our approach, called Performer-MPC, uses a learned cost function parameterized by vision context embeddings provided by Performers, a low-rank implicit-attention Transformer. We jointly train the cost function and construct the controller relying on it, effectively solving end-to-end the corresponding bi-level optimization problem. We show that the resulting policy improves standard MPC performance by leveraging a few expert demonstrations of the desired navigation behavior in different challenging real-world scenarios. Compared with a standard MPC policy, Performer-MPC achieves 40% better goal reached in cluttered environments and 65% better sociability when navigating around humans.
Preview abstract
The recent widespread use of model predictive control (MPC) in safety-critical systems has placed additional emphasis on developing algorithms that have strict performance guarantees despite the presence of model error or external disturbances.
This tutorial summarizes the key theoretical results of combining contraction theory with MPC to enable provably-safe motion planning for robotic and aerospace systems.
The first approach presented establishes the fundamental result that any closed-loop contracting system has an associated state and control input invariant tube which can serve as a safety margin within the motion planning problem.
This result is leveraged in an alternative approach that utilizes neural networks and imitation learning to offload the computational complexity of online motion planning while maintaining strong safety guarantees.
Finally, current challenges and future research directions, e.g., online model learning, are discussed.
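The invariant-tube idea can be illustrated in the simplest possible setting. The bound below is a scalar sketch under stated assumptions, not the tutorial's general result: e(t) denotes the deviation from the planned trajectory, λ > 0 is an assumed contraction rate, and d(t) is a disturbance bounded by an assumed constant.

```latex
\dot{e}(t) = -\lambda\, e(t) + d(t), \quad |d(t)| \le \bar{d}
\;\;\Longrightarrow\;\;
|e(t)| \le e^{-\lambda t}\,|e(0)| + \frac{\bar{d}}{\lambda}\left(1 - e^{-\lambda t}\right)
```

In this scalar case the deviation settles inside a tube of radius d̄/λ around the plan, which is the kind of margin a planner can add to obstacles and other constraints.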