Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Lightweight text-driven image editing with disentangled content and attributes

Li, Bo, Lin, Xiao, Liu, Bin, He, Zhi-Fen and Lai, Yu-Kun ORCID: https://orcid.org/0000-0002-2094-5680 2024. Lightweight text-driven image editing with disentangled content and attributes. IEEE Transactions on Multimedia 26, pp. 1829-1841. 10.1109/TMM.2023.3289755

PDF - Accepted Post-Print Version

Abstract

Text-driven image editing aims to manipulate images under the guidance of natural language descriptions. Text is more natural and intuitive than many other interaction modes and has recently attracted growing attention. However, unlike classical supervised learning tasks, there is so far no standard benchmark dataset for text-driven interactive image editing, which makes it hard to train an end-to-end model for pixel-aligned interactive image editing driven by text. Some methods follow the paradigm of text-to-image models by incorporating the target image into the text-to-image generation process. However, these methods rely on cross-modal text-to-image generation and therefore involve complicated and expensive models, which can lead to inconsistent editing effects. In this article, a novel text-driven image editing method is proposed. Our key observation is that this task can be learned more efficiently using image-to-image translation. To ensure effective learning for image editing, our framework takes paired text and corresponding images for training and disentangles each image into content and attributes, such that the content is maintained while the attributes are modified according to the text. Our network is a lightweight encoder-decoder architecture that accomplishes pixel-aligned end-to-end training via cycle-consistent supervision. Quantitative and qualitative experimental results show that the proposed method achieves state-of-the-art performance.
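
The idea of keeping content fixed while swapping attributes, supervised by a cycle back to the original, can be illustrated with a toy sketch. This is not the authors' network: it assumes an image is a flat feature vector whose first `content_dim` entries are content and whose remaining entries are attributes, and that a text description has already been mapped to an attribute vector. All function names here are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def encode(image: Vector, content_dim: int) -> Tuple[Vector, Vector]:
    """Toy encoder (assumption): split features into (content, attributes)."""
    return image[:content_dim], image[content_dim:]


def decode(content: Vector, attributes: Vector) -> Vector:
    """Toy decoder (assumption): recombine content and attributes."""
    return content + attributes


def edit(image: Vector, text_attributes: Vector, content_dim: int) -> Vector:
    """Keep the image's content, replace its attributes with the text's."""
    content, _ = encode(image, content_dim)
    return decode(content, text_attributes)


def l1_loss(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    image: Vector,
    source_text_attributes: Vector,
    target_text_attributes: Vector,
    content_dim: int,
) -> float:
    """Edit with the target text, edit back with the source text,
    and measure how far the result is from the original image."""
    edited = edit(image, target_text_attributes, content_dim)
    restored = edit(edited, source_text_attributes, content_dim)
    return l1_loss(image, restored)


# Hypothetical example values: 2 content features, 2 attribute features.
image = [0.2, 0.5, 0.9, 0.1]
source_attrs = [0.9, 0.1]   # attributes described by the paired source text
target_attrs = [0.0, 1.0]   # attributes requested by the editing text

edited_image = edit(image, target_attrs, content_dim=2)
cycle_loss = cycle_consistency_loss(image, source_attrs, target_attrs, content_dim=2)
```

In this toy setting the cycle loss is zero exactly when the source text attributes match the image's own attributes, which is the role paired text and images play during training; a real model would learn the encoder and decoder rather than slicing a vector.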

Item Type: Article
Date Type: Publication
Status: Published
Schools: Engineering; Computer Science & Informatics
Publisher: Institute of Electrical and Electronics Engineers
ISSN: 1520-9210
Date of First Compliant Deposit: 2 August 2023
Date of Acceptance: 8 June 2023
Last Modified: 26 Mar 2024 16:21
URI: https://orca.cardiff.ac.uk/id/eprint/161181
