TY - JOUR
T1 - Peeking behind objects
T2 - Layered depth prediction from a single image
AU - Dhamo, Helisa
AU - Tateno, Keisuke
AU - Laina, Iro
AU - Navab, Nassir
AU - Tombari, Federico
N1 - Publisher Copyright:
© 2019 Elsevier B.V.
PY - 2019/7/1
Y1 - 2019/7/1
N2 - While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects. This limits the use of depth prediction in augmented and virtual reality applications that aim at scene exploration by synthesizing the scene from a different vantage point, or at diminished reality. To address this issue, we shift the focus from conventional depth map prediction to the regression of a specific data representation called Layered Depth Image (LDI), which contains information about the occluded regions in the reference frame and can fill in occlusion gaps in case of small view changes. We propose a novel approach based on Convolutional Neural Networks (CNNs) to jointly predict depth maps and foreground separation masks used to condition Generative Adversarial Networks (GANs) for hallucinating plausible color and depth in the initially occluded areas. We demonstrate the effectiveness of our approach for novel scene view synthesis from a single image.
AB - While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects. This limits the use of depth prediction in augmented and virtual reality applications that aim at scene exploration by synthesizing the scene from a different vantage point, or at diminished reality. To address this issue, we shift the focus from conventional depth map prediction to the regression of a specific data representation called Layered Depth Image (LDI), which contains information about the occluded regions in the reference frame and can fill in occlusion gaps in case of small view changes. We propose a novel approach based on Convolutional Neural Networks (CNNs) to jointly predict depth maps and foreground separation masks used to condition Generative Adversarial Networks (GANs) for hallucinating plausible color and depth in the initially occluded areas. We demonstrate the effectiveness of our approach for novel scene view synthesis from a single image.
KW - Generative adversarial networks
KW - Layered depth image
KW - Occlusion
KW - RGB-D inpainting
UR - http://www.scopus.com/inward/record.url?scp=85065923200&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2019.05.007
DO - 10.1016/j.patrec.2019.05.007
M3 - Article
AN - SCOPUS:85065923200
SN - 0167-8655
VL - 125
SP - 333
EP - 340
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -