Unsupervised Training for 3D Morphable Model Regression
Abstract
We present a method for training a regression network from image pixels to 3D morphable model coordinates using only unlabeled photographs. The training loss is based on features from a facial recognition network, computed on-the-fly by rendering the predicted faces with a differentiable renderer. To make training from features feasible and avoid network fooling effects, we introduce three objectives: a batch regularization loss that encourages the output distribution to match the distribution of the morphable model, a loopback loss that ensures the regression network can correctly reinterpret its own output, and a multi-view loss that compares the predicted 3D face to the input photograph from multiple viewing angles. We train a regression network using these objectives, a set of unlabeled photographs, and the morphable model itself, and demonstrate state-of-the-art results.
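The first two objectives lend themselves to a compact illustration. The sketch below is a minimal, hypothetical NumPy version of the batch regularization loss (assuming, as is common for morphable models, that the coefficients are modeled as standard-normal) and the loopback consistency check; the function names, weighting, and the toy data are ours, not from the paper, and the differentiable rendering and multi-view comparison are omitted.

```python
import numpy as np

def batch_regularization_loss(params):
    # Hypothetical sketch: morphable-model coefficients are assumed to be
    # distributed as N(0, I), so we penalize the deviation of the batch
    # mean from 0 and the batch variance from 1, per coefficient.
    mean = params.mean(axis=0)
    var = params.var(axis=0)
    return float(np.sum(mean ** 2) + np.sum((var - 1.0) ** 2))

def loopback_loss(params, reinterpreted_params):
    # Hypothetical sketch: the regressor, applied to a rendering of its
    # own predicted face, should recover the coefficients it predicted.
    return float(np.sum((params - reinterpreted_params) ** 2))

# Toy batch: 4 faces, 8 morphable-model coefficients each.
rng = np.random.default_rng(0)
batch = rng.standard_normal((4, 8))

reg = batch_regularization_loss(batch)
loop = loopback_loss(batch, batch)  # perfect reinterpretation -> 0
```

In a real training loop both terms would be differentiable tensor operations and would be summed, with weights, alongside the facial-recognition feature loss computed on the rendered predictions.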