Optimizing Trajectories with Closed-Loop Dynamic SQP

Sumeet Singh; Jean-Jacques Slotine; Vikas Sindhwani

ICRA (2022)

Abstract

Indirect trajectory optimization methods such as Differential Dynamic Programming (DDP) have found considerable success when planning only under dynamic feasibility constraints. Meanwhile, nonlinear programming (NLP) has been the state-of-the-art approach when faced with additional constraints (e.g., control bounds, obstacle avoidance). However, a naïve implementation of NLP algorithms, e.g., shooting-based sequential quadratic programming (SQP), may suffer from slow convergence, caused by natural instabilities of the underlying system manifesting as poor numerical stability within the optimization. Re-interpreting the DDP closed-loop rollout policy as a sensitivity-based correction to a second-order search direction, we demonstrate how to compute analogous closed-loop policies (i.e., feedback gains) for constrained problems. Our key theoretical result introduces a novel dynamic programming-based constraint-set recursion that augments the canonical "cost-to-go" backward pass. On the algorithmic front, we develop a hybrid-SQP algorithm incorporating DDP-style closed-loop rollouts, enabled via efficient parallelized computation of the feedback gains. Finally, we validate our theoretical and algorithmic contributions on a set of increasingly challenging benchmarks, demonstrating significant improvements in convergence speed over standard open-loop SQP.
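
The difference between an open-loop and a DDP-style closed-loop rollout can be illustrated with a minimal sketch. This is a toy, unconstrained example, not the paper's hybrid-SQP algorithm or its constraint-set recursion. The scalar dynamics, the cost weights, and every name (`step`, `backward_pass`, `search_direction`, `rollout`) are assumptions chosen for illustration. The sketch linearizes an unstable system about the nominal trajectory x = 0, u = 0, computes a search direction and feedback gains with a scalar cost-to-go backward pass, and then rolls out the nonlinear system with and without the feedback correction.

```python
import math

# Hypothetical toy problem (not from the paper): a scalar system that is
# unstable around its nominal trajectory x = 0, u = 0.
DT, T = 0.1, 60
GOAL, Q, R, QF = 1.0, 1.0, 0.1, 10.0


def step(x, u):
    """Nonlinear dynamics; the 2*sin(x) term makes x = 0 unstable."""
    return x + DT * (2.0 * math.sin(x) + u)


# Linearization of `step` about the nominal trajectory (x = 0, u = 0).
A, B = 1.0 + 2.0 * DT, DT


def backward_pass():
    """Scalar cost-to-go recursion; returns feedforward k and feedback K."""
    P, p = QF, -QF * GOAL  # terminal cost 0.5 * QF * (x - GOAL)^2
    k, K = [0.0] * T, [0.0] * T
    for t in reversed(range(T)):
        q_x, q_u = -Q * GOAL + A * p, B * p
        q_xx, q_uu, q_ux = Q + A * P * A, R + B * P * B, B * P * A
        k[t], K[t] = -q_u / q_uu, -q_ux / q_uu
        P, p = q_xx + K[t] * q_ux, q_x + K[t] * q_u
    return k, K


def search_direction(k, K):
    """Step (dx, du) predicted by the linearized model."""
    dx, du = [0.0], []
    for t in range(T):
        du.append(k[t] + K[t] * dx[t])
        dx.append(A * dx[t] + B * du[t])
    return dx, du


def rollout(dx, du, K, closed_loop):
    """Roll out the nonlinear system; return the worst deviation from dx."""
    x, worst = 0.0, 0.0
    for t in range(T):
        # Closed loop: correct du[t] using the gap to the predicted state.
        u = du[t] + (K[t] * (x - dx[t]) if closed_loop else 0.0)
        x = step(x, u)
        worst = max(worst, abs(x - dx[t + 1]))
    return worst


k, K = backward_pass()
dx, du = search_direction(k, K)
open_dev = rollout(dx, du, K, closed_loop=False)
closed_dev = rollout(dx, du, K, closed_loop=True)
print(f"open-loop deviation from prediction:   {open_dev:.3f}")
print(f"closed-loop deviation from prediction: {closed_dev:.3f}")
```

In the open-loop rollout, the mismatch between the linear prediction and the nonlinear system is amplified at every step by the unstable dynamics, so the realized trajectory drifts away from the predicted one. The feedback term damps that mismatch, which is the intuition behind using closed-loop rollouts in place of open-loop shooting.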
