Human Appearance Transfer
Abstract
We propose an automatic person-to-person appearance
transfer model based on explicit parametric 3d human representations
and learned, constrained deep translation network
architectures for photographic image synthesis. Given
a single source image and a single target image, each
depicting a different human subject, wearing different
clothing and captured in a different pose, our goal is to photorealistically
transfer the appearance from the source image
onto the target image while preserving the target shape
and clothing segmentation layout. Our solution to this new
problem is formulated in terms of a computational pipeline
that combines (1) 3d human pose and body shape estimation
from monocular images, (2) identifying the 3d surface color
elements (mesh triangles) visible in both images, which can be
transferred directly using barycentric procedures, and (3)
predicting surface appearance missing in the first image but
visible in the second one using deep learning-based image
synthesis techniques. Our model achieves promising results
as supported by a perceptual user study where the participants
rated around 65% of our results as good, very good
or perfect, as well as in automated tests (Inception scores and
a Faster-RCNN human detector responding very similarly
to real and model-generated images). We further show how
the proposed architecture can be used to automatically
generate images of a person dressed in different clothing
transferred from a person in another image, opening
paths for applications in entertainment and photo-editing
(e.g. embodying and posing as friends or famous actors),
the fashion industry, or affordable online shopping of clothing.
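To make step (2) of the pipeline concrete, the sketch below shows one way barycentric appearance transfer between corresponding mesh triangles could be implemented. It is a minimal illustration under stated assumptions, not the paper's implementation: it presumes the same parametric mesh topology has been fitted to both subjects, that the 2d projections of each triangle's vertices in each image are available, and that per-triangle visibility has already been tested; all function and variable names (sample_bilinear, transfer_triangle, visible_both, src_proj, tgt_proj) are hypothetical.

```python
# Hypothetical sketch of step (2): barycentric color transfer for mesh
# triangles visible in both images.
import numpy as np

def sample_bilinear(image, xy):
    """Bilinearly sample an (H, W, 3) image at continuous pixel coords (x, y)."""
    h, w = image.shape[:2]
    x, y = xy
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    x0, y0 = max(x0, 0), max(y0, 0)
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bot = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bot

def transfer_triangle(src_img, tgt_img, src_tri, tgt_tri, n=8):
    """Copy appearance from a source triangle to the corresponding target one.

    src_tri, tgt_tri: (3, 2) arrays holding the 2d projections of the same
    mesh triangle's vertices in the source and target images.
    """
    h, w = tgt_img.shape[:2]
    # Sweep barycentric coordinates (a, b, 1 - a - b) over the triangle interior.
    for i in range(n + 1):
        for j in range(n + 1 - i):
            bary = np.array([i / n, j / n, 1.0 - i / n - j / n])
            src_xy = bary @ src_tri                        # point in the source image
            tx, ty = np.round(bary @ tgt_tri).astype(int)  # matching target pixel
            if 0 <= tx < w and 0 <= ty < h:
                tgt_img[ty, tx] = sample_bilinear(src_img, src_xy)

# Usage sketch: transfer every triangle the visibility test marks as visible
# in both images; the remaining target pixels are left to the learned image
# synthesis stage (step 3).
# for t in visible_both:
#     transfer_triangle(source_image, target_image, src_proj[t], tgt_proj[t])
```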
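The Faster-RCNN test mentioned above can be read as a fidelity check: a detector trained on real photographs should respond to synthesized images roughly as it does to real ones. The snippet below is one plausible way to run such a check with torchvision's pretrained detector; it illustrates the idea and is not the evaluation protocol used in the paper.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Pretrained COCO detector; label 1 corresponds to the 'person' class.
detector = fasterrcnn_resnet50_fpn(pretrained=True).eval()

def person_confidence(image):
    """Highest 'person' detection score for a (3, H, W) float image in [0, 1]."""
    with torch.no_grad():
        detections = detector([image])[0]
    scores = detections["scores"][detections["labels"] == 1]
    return scores.max().item() if scores.numel() else 0.0

# Comparing the score distributions over real and generated images gives a
# rough automatic measure of how 'human-like' the synthesized people appear.
# real_scores = [person_confidence(img) for img in real_images]
# fake_scores = [person_confidence(img) for img in generated_images]
```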