Object-Driven Multi-Layer Scene Decomposition From a Single Image (ICCV 2019)

Helisa Dhamo     Nassir Navab     Federico Tombari    

Technical University of Munich    Google




We present a method that tackles the challenge of predicting color and depth behind the visible content of an image. Our approach builds a Layered Depth Image (LDI) from a single RGB input. An LDI is an efficient representation that arranges the scene in layers, including originally occluded regions. Unlike previous work, we enable an adaptive scheme for the number of layers and incorporate semantic encoding for better hallucination of partly occluded objects. Additionally, our approach is object-driven, which especially boosts the accuracy for occluded intermediate objects. The framework consists of two steps. First, we individually complete each object in terms of color and depth, while estimating the scene layout. Second, we rebuild the scene from the regressed layers and constrain the recomposed image to resemble the structure of the original input. The learned representation enables various applications, such as 3D photography and diminished reality, all from a single RGB image.
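The layered representation described in the abstract can be illustrated with a small toy example. The sketch below is not the authors' network or rendering code; it only assumes that each layer stores per-pixel color and depth, with `None` marking pixels a layer does not cover. The names `Layer`, `recompose`, and `remove_layer` are hypothetical and chosen for illustration. It shows how per-object layers could be recomposed into the visible image with a depth test, and how dropping one layer reveals the content behind it (the idea behind diminished reality).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]


@dataclass
class Layer:
    """One layer of a toy LDI (hypothetical structure, not the paper's format).

    color[y][x] and depth[y][x] are None where the layer has no sample.
    """
    color: List[List[Optional[Color]]]
    depth: List[List[Optional[float]]]


def recompose(layers: List[Layer], height: int, width: int):
    """Rebuild the visible image: per pixel, keep the sample nearest the camera."""
    image: List[List[Optional[Color]]] = [[None] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    for layer in layers:
        for y in range(height):
            for x in range(width):
                d = layer.depth[y][x]
                if d is not None and d < zbuf[y][x]:
                    zbuf[y][x] = d
                    image[y][x] = layer.color[y][x]
    return image, zbuf


def remove_layer(layers: List[Layer], index: int) -> List[Layer]:
    """Return a copy of the layer list without the layer at `index`."""
    return layers[:index] + layers[index + 1:]


# A 1x3 scene: a grey layout (background) layer that is complete everywhere,
# and a red object layer that covers only the middle pixel, closer to the camera.
GREY: Color = (128, 128, 128)
RED: Color = (255, 0, 0)

layout = Layer(color=[[GREY, GREY, GREY]], depth=[[5.0, 5.0, 5.0]])
obj = Layer(color=[[None, RED, None]], depth=[[None, 2.0, None]])

visible, visible_depth = recompose([layout, obj], height=1, width=3)
without_obj, _ = recompose(remove_layer([layout, obj], 1), height=1, width=3)
```

In this toy scene, `visible` has the red object in front of the layout at the middle pixel, while `without_obj` shows the layout everywhere, because the layout layer already holds color and depth for the region the object occluded.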

Paper

International Conference on Computer Vision (ICCV 2019)
Paper PDF | arXiv
  @inproceedings{Dhamo2019iccv,
    title={Object-Driven Multi-Layer Scene Decomposition From a Single Image},
    author={Dhamo, Helisa and Navab, Nassir and Tombari, Federico},
    booktitle={Proceedings IEEE International Conference on Computer Vision (ICCV)},
    year={2019}
  }

Dataset Download

To train our models, we generated large-scale datasets that contain layered scene representations.

Please note: The following dataset is built upon Stanford 2D-3D-S. By downloading it, you therefore agree to the respective terms of use.

Download: If you would like to download our data, please fill out this form (OMLD Terms of Use).

Code

The rendering code used to generate the layered data can be found here.