TY - GEN
T1 - Face2Face: Real-time face capture and reenactment of RGB videos
T2 - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
AU - Thies, Justus
AU - Zollhöfer, Michael
AU - Stamminger, Marc
AU - Theobalt, Christian
AU - Nießner, Matthias
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/12/9
Y1 - 2016/12/9
AB - We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track facial expressions of both source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast and efficient deformation transfer between source and target. The mouth interior that best matches the re-targeted expression is retrieved from the target sequence and warped to produce an accurate fit. Finally, we convincingly re-render the synthesized target face on top of the corresponding video stream such that it seamlessly blends with the real-world illumination. We demonstrate our method in a live setup, where YouTube videos are reenacted in real time.
UR - http://www.scopus.com/inward/record.url?scp=84986308411&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2016.262
DO - 10.1109/CVPR.2016.262
M3 - Conference contribution
AN - SCOPUS:84986308411
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 2387
EP - 2395
BT - Proceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
PB - IEEE Computer Society
Y2 - 26 June 2016 through 1 July 2016
ER -