PRISM: Progressive Restoration for Scene Graph-Based Image Manipulation

  • Pavel Jahoda
  • Yousef Yeganeh
  • Ehsan Adeli
  • Nassir Navab
  • Azade Farshad
  • Technical University of Munich
  • Munich Center for Machine Learning
  • Stanford University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Scene graphs have emerged as accurate semantic descriptions for image generation and manipulation tasks; however, their complexity, together with the diversity of object shapes and relations in the data, makes it challenging to incorporate them into models and generate high-quality results. To address these challenges, we propose PRISM, a novel progressive multi-head image manipulation approach that improves the accuracy with which target regions in the scene are manipulated. Our image manipulation framework is trained using an end-to-end denoising masked reconstruction proxy task, in which the masked regions are progressively unmasked from the outer regions to the inner part. We take advantage of the outer part of the masked area because it correlates directly with the context of the scene. Moreover, our multi-head architecture simultaneously generates detailed object-specific regions in addition to the entire image to produce higher-quality images. Our model is evaluated against other methods on the semantic image manipulation task using the CLEVR and Visual Genome datasets. Our results demonstrate the potential of our approach for enhancing the quality and precision of scene graph-based image manipulation.
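
The outer-to-inner unmasking order described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the unmasking order is computed, so the sketch assumes a simple 4-neighbour binary erosion, and the names `erode` and `unmasking_schedule` are hypothetical.

```python
def erode(mask):
    """One step of 4-neighbour binary erosion on a list-of-lists 0/1 mask.

    Assumption for illustration only: a masked pixel stays masked only if
    all four neighbours are masked too. Pixels beyond the image border are
    treated as unmasked.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            keep = all(
                0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                for ny, nx in neighbours
            )
            out[y][x] = 1 if keep else 0
    return out


def unmasking_schedule(mask):
    """Return the list of masks from fully masked to fully unmasked.

    Each step reveals the current outer ring of the masked region, so the
    pixels nearest the visible scene context are unmasked first.
    """
    schedule = [mask]
    while any(any(row) for row in schedule[-1]):
        schedule.append(erode(schedule[-1]))
    return schedule


if __name__ == "__main__":
    # A 5x5 image with a 3x3 masked block in the centre.
    mask = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)]
            for y in range(5)]
    for step, m in enumerate(unmasking_schedule(mask)):
        print(step, sum(map(sum, m)), "masked pixels")
```

In a training loop of the kind the abstract describes, each mask in such a schedule would presumably define the region still to be reconstructed at that stage; how PRISM actually couples the schedule to its denoising objective is specified in the paper, not here.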

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 142-160
Number of pages: 19
ISBN (Print): 9783031918377
DOIs
State: Published - 2025
Event: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sep 2024 to 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15631 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Country/Territory: Italy
City: Milan
Period: 29/09/24 to 4/10/24
