THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers

Mihai Zanfir; Andrei Zanfir; Eduard Gabriel Bazavan; Bill Freeman; Rahul Sukthankar; Cristian Sminchisescu

Proceedings of the IEEE/CVF International Conference on Computer Vision (2021)

Abstract

We present THUNDR, a transformer-based deep neural network methodology to reconstruct the 3D pose and shape of people, given monocular RGB images. Key to our methodology is an intermediate 3D marker representation, where we aim to combine the predictive power of model-free output architectures and the regularizing, anthropometrically-preserving properties of a statistical human surface model like GHUM (a recently introduced, expressive full-body statistical 3D human model, trained end-to-end). Our novel transformer-based prediction pipeline can focus on image regions relevant to the task, supports self-supervised regimes, and ensures that solutions are consistent with human anthropometry. We show state-of-the-art results on Human3.6M and 3DPW, for both the fully-supervised and the self-supervised models, for the task of inferring 3D human shape, joint positions, and global translation. Moreover, we observe very solid 3D reconstruction performance for difficult human poses collected in the wild. Models will be made available for research.
