Content-aware Generative Modeling of Graphic Design Layouts - Zheng - SIGGRAPH 2019

 

Info

  • Title: Content-aware Generative Modeling of Graphic Design Layouts
  • Task: Layout Design
  • Authors: X. Zheng, X. Qiao, Y. Cao, and R. W. H. Lau
  • Date: Jul. 2019
  • Published: SIGGRAPH 2019
  • Affiliation: City University of Hong Kong (CityU)

Highlights & Drawbacks

  • The first content-aware deep generative model for graphic design layouts, which is able to synthesize diverse graphic design layouts based on visual and textual features.
  • A large-scale magazine layout dataset with rich semantic annotations, including categories, fine-grained semantic layouts, and keywords summarizing the text contents.

Motivation & Design

The dataset: A corpus of 3,919 magazine pages from the Internet, covering 6 common categories: fashion, food, news, science, travel and wedding. As these 6 categories of magazine pages cover a large variety of contents, they exhibit rich layout variation. Each page is annotated with 6 different semantic elements: Text, Image, Headline, Text-over-image, Headline-over-image and Background. In addition, keywords are extracted from the text contents of each page to represent the text.
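For concreteness, below is a minimal, hypothetical sketch of how one annotated page could be represented in code. The six categories and six semantic element types come from the paper; the field names, bounding-box convention, and example values are illustrative assumptions.

```python
from dataclasses import dataclass, field

# From the paper: 6 magazine categories and 6 semantic element types.
CATEGORIES = ["fashion", "food", "news", "science", "travel", "wedding"]
ELEMENT_TYPES = ["Text", "Image", "Headline",
                 "Text-over-image", "Headline-over-image", "Background"]

@dataclass
class Element:
    kind: str    # one of ELEMENT_TYPES
    bbox: tuple  # (x0, y0, x1, y1) in normalized page coordinates (assumed)

@dataclass
class PageAnnotation:
    category: str                                 # one of CATEGORIES
    elements: list = field(default_factory=list)  # fine-grained semantic layout
    keywords: list = field(default_factory=list)  # keywords extracted from text

# Hypothetical example record:
page = PageAnnotation(
    category="travel",
    elements=[Element("Headline", (0.05, 0.05, 0.95, 0.15)),
              Element("Image", (0.00, 0.20, 1.00, 0.70)),
              Element("Text", (0.05, 0.72, 0.95, 0.98))],
    keywords=["beach", "island", "summer"],
)
```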


The framework of the model: It has two main parts: a multi-modal embedding network and a layout generative network. The multi-modal embedding network learns multi-modal features $y$ from three inputs: visual contents (images), textual contents (keywords) and 3 high-level design attributes (design category, text proportion $T_p$, and image proportion $I_p$). These inputs are first sent to 3 independent encoders, i.e., an image encoder, a text encoder and an attribute encoder, respectively, and then merged via a fusion module to obtain $y$. The layout generative network learns a distribution of layouts conditioned on $y$ and extracts content-aware features $\hat{z}$. In particular, a layout encoder $E$ maps a layout sample $x$ to features $\hat{z}$ conditioned on $y$, a layout generator $G$ maps a random vector $z$ to a layout sample $\tilde{x}$ conditioned on $y$, and a discriminator $D$ learns to distinguish joint pairs $(x, \hat{z})$ and $(\tilde{x}, z)$, both conditioned on $y$.
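To make the data flow concrete, the PyTorch sketch below wires up a concatenation-based fusion module and the three conditional networks $E$, $G$ and $D$. All layer sizes, the flattened layout representation, and the exact conditioning scheme are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Fusion(nn.Module):
    """Merges image, text and attribute features into multi-modal features y."""
    def __init__(self, d_img=128, d_txt=128, d_attr=32, d_y=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(d_img + d_txt + d_attr, d_y), nn.ReLU(),
            nn.Linear(d_y, d_y))

    def forward(self, f_img, f_txt, f_attr):
        return self.mlp(torch.cat([f_img, f_txt, f_attr], dim=1))

class LayoutEncoder(nn.Module):
    """E: maps a layout x to features z_hat (as a Gaussian), conditioned on y."""
    def __init__(self, d_x=6 * 64 * 64, d_y=128, d_z=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_x + d_y, 256), nn.ReLU())
        self.mu = nn.Linear(256, d_z)
        self.logvar = nn.Linear(256, d_z)

    def forward(self, x, y):
        h = self.net(torch.cat([x.flatten(1), y], dim=1))
        return self.mu(h), self.logvar(h)

class LayoutGenerator(nn.Module):
    """G: maps a random vector z to a layout x_tilde, conditioned on y."""
    def __init__(self, d_z=32, d_y=128, d_x=6 * 64 * 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_z + d_y, 256), nn.ReLU(), nn.Linear(256, d_x))

    def forward(self, z, y):
        return self.net(torch.cat([z, y], dim=1))

class Discriminator(nn.Module):
    """D: scores joint (layout, code) pairs, conditioned on y."""
    def __init__(self, d_x=6 * 64 * 64, d_z=32, d_y=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_x + d_z + d_y, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, x, z, y):
        return self.net(torch.cat([x.flatten(1), z, y], dim=1))
```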

Loss function

  • Discriminator: least-squares GAN loss.
  • Generator: least-squares GAN loss, reconstruction loss, and Kullback-Leibler (KL) divergence loss.
  • Encoder: reconstruction loss and Kullback-Leibler (KL) divergence loss.
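These losses can be written down compactly. The sketch below uses least-squares GAN targets of 1 (real) and 0 (fake), an L1 reconstruction term, and a standard Gaussian KL term; the target values, the choice of L1, and the loss weights are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def d_loss(d_real, d_fake):
    # Least-squares GAN loss for D: push real pairs (x, z_hat) toward 1
    # and fake pairs (x_tilde, z) toward 0.
    return 0.5 * ((d_real - 1).pow(2).mean() + d_fake.pow(2).mean())

def kl_loss(mu, logvar):
    # KL divergence between the encoder posterior N(mu, sigma^2) and N(0, I).
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def g_loss(d_fake, x_rec, x, mu, logvar, lam_rec=10.0, lam_kl=0.01):
    # Generator: LSGAN term + reconstruction + KL (weights are assumed).
    adv = 0.5 * (d_fake - 1).pow(2).mean()
    return adv + lam_rec * F.l1_loss(x_rec, x) + lam_kl * kl_loss(mu, logvar)

def e_loss(x_rec, x, mu, logvar, lam_rec=10.0, lam_kl=0.01):
    # Encoder: reconstruction + KL, sharing the same terms as the generator.
    return lam_rec * F.l1_loss(x_rec, x) + lam_kl * kl_loss(mu, logvar)
```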

Performance & Ablation Study

Diversity (results figure omitted)

Constrained layout generation results: In each case, the input contents and the input sketch, which indicates the approximate positions and sizes of the desired elements in the output layouts ("T": Text element, "I": Image element, "H": Headline element, "T\I": Text-over-image element, "H\I": Headline-over-image element), are shown on the left. Results by the baseline (Baseline), the paper's method (Ours), and the ground truth (Ground Truth) are shown on the right, where the Headline is filled with a sequence of A's in bold. Note that, in each case, the text and image proportions used in both the paper's method and the baseline are obtained from the ground-truth layout.