Using layer information to improve control of image generation models


Abstract/Contents

Abstract
While generative models can produce photorealistic images resembling a prompt, text is an ambiguous medium for conveying an artist's intent. In particular, text is an imprecise way to describe complex scene compositions: it is difficult to state where objects are located spatially and how they should appear visually. Inspired by techniques from graphics systems, we introduce the abstraction of a layer to image generation models and explore how to leverage layer-based information to enable greater control over the generative process. We allow users to define output targets by expressing the spatial layout of layers and to specify per-layer constraints that control each layer's appearance. To enable artists to create layer-based compositions, we provide a functioning implementation of an easy-to-use and highly performant web interface that lets users iteratively generate complex images with diffusion models. We demonstrate through a user study that layer-based information allows novice users of the technology to quickly arrive at images to their liking.

Description

Type of resource text
Date created June 9, 2023
Publication date June 22, 2023; June 9, 2023

Creators/Contributors

Author Li, Linden
Degree granting institution Stanford University, Department of Computer Science
Advisor Fatahalian, Kayvon
Advisor Ré, Christopher

Subjects

Subject Artificial intelligence
Subject Generative AI
Genre Text
Genre Thesis

Access conditions

Use and reproduction
User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).

Preferred citation
Li, L. and Stanford University, Department of Computer Science (2023). Using layer information to improve control of image generation models. Stanford Digital Repository. Available at https://purl.stanford.edu/tm287xk2802. https://doi.org/10.25740/tm287xk2802.

Collection

Undergraduate Theses, School of Engineering
