Info
- Title: Generative Image Inpainting with Contextual Attention
- Task: Image Inpainting
- Author: J. Yu, Z. Lin, J. Yang, X. Shen, X. Lu, and T. S. Huang
- Date: Jan. 2018
- Arxiv: 1801.07892
- Published: CVPR 2018
- Affiliation: UIUC & Adobe
Highlights & Drawbacks
- Proposes a novel contextual attention layer that explicitly attends to related feature patches at distant spatial locations.
- Introduces a spatially discounted reconstruction loss to improve training stability and speed, building on the then state-of-the-art generative image inpainting network.
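
As a rough illustration of the spatially discounted reconstruction loss, the NumPy sketch below weights each missing pixel by `gamma ** l`, where `l` is its distance to the nearest known pixel, so that pixels deep inside the hole (where many completions are plausible) contribute less. The function names, the Euclidean distance metric, the weight of 1 for known pixels, the normalisation by the weight sum, and the brute-force distance search are all assumptions made for this sketch, not details taken from these notes or from the authors' code.

```python
import numpy as np

def spatial_discount_weights(mask, gamma=0.99):
    """Per-pixel loss weights for a 2-D hole mask (True = missing pixel).

    Hole pixels get gamma ** l, with l the Euclidean distance to the
    nearest known pixel (assumed metric); known pixels keep weight 1.
    """
    mask = np.asarray(mask, dtype=bool)
    weights = np.ones(mask.shape, dtype=np.float64)
    hole = np.argwhere(mask)    # (N_hole, 2) coordinates of missing pixels
    known = np.argwhere(~mask)  # (N_known, 2) coordinates of known pixels
    if hole.size == 0:
        return weights
    if known.size == 0:
        raise ValueError("mask must contain at least one known pixel")
    # Brute-force nearest-known-pixel distance; fine for small masks only.
    diff = hole[:, None, :] - known[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)
    weights[mask] = gamma ** dist
    return weights

def discounted_l1(pred, target, mask, gamma=0.99):
    """Weighted mean absolute error over 2-D arrays of the same shape as mask."""
    w = spatial_discount_weights(mask, gamma)
    return float((w * np.abs(pred - target)).sum() / w.sum())
```

With a small `gamma` the effect is easy to see: on a 5x5 image with a 3x3 central hole, border pixels of the hole sit one pixel from known content and the center sits two pixels away, so the center is discounted more strongly.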
Motivation & Design
Overview of the improved generative inpainting framework. The coarse network is trained explicitly with a reconstruction loss, while the refinement network is trained with the reconstruction loss as well as global and local WGAN-GP adversarial losses.
Illustration of the contextual attention layer. First, convolution is used to compute the matching score between foreground patches and background patches (the background patches act as convolutional filters). Then softmax is applied across the background patches to obtain an attention score for each foreground pixel. Finally, the foreground patches are reconstructed from the background patches by performing deconvolution on the attention scores. The contextual attention layer is differentiable and fully convolutional.
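
The three steps in the caption above (match, softmax, reconstruct) can be sketched in plain NumPy. This is a minimal, unoptimised sketch under several assumptions that are not stated in these notes: patches are 3x3 with stride 1 and zero padding, matching uses cosine similarity, the softmax is sharpened by a hypothetical constant `lam`, a patch counts as background when its center pixel is known, and overlapping reconstructed patches are averaged. The convolution and deconvolution are written as explicit patch extraction plus matrix products, which computes the same quantities on a small feature map.

```python
import numpy as np

def extract_patches(feat, k=3):
    """Return one flattened k x k patch per pixel (zero padding, stride 1)."""
    h, w, c = feat.shape
    r = k // 2
    padded = np.pad(feat, ((r, r), (r, r), (0, 0)))
    patches = np.empty((h, w, k * k * c), dtype=feat.dtype)
    for i in range(h):
        for j in range(w):
            patches[i, j] = padded[i:i + k, j:j + k].ravel()
    return patches

def contextual_attention(feat, hole, k=3, lam=10.0, eps=1e-8):
    """feat: (H, W, C) features; hole: (H, W) bool, True = foreground (missing)."""
    feat = np.asarray(feat, dtype=np.float64)
    hole = np.asarray(hole, dtype=bool)
    h, w, c = feat.shape
    r = k // 2
    patches = extract_patches(feat, k)
    fg = patches[hole]    # (N_fg, k*k*C) patches centered on hole pixels
    bg = patches[~hole]   # (N_bg, k*k*C) patches centered on known pixels

    # Step 1: matching score = cosine similarity (the "convolution" step).
    fg_n = fg / (np.linalg.norm(fg, axis=1, keepdims=True) + eps)
    bg_n = bg / (np.linalg.norm(bg, axis=1, keepdims=True) + eps)
    scores = fg_n @ bg_n.T                      # (N_fg, N_bg)

    # Step 2: scaled softmax over background patches.
    logits = lam * scores
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)

    # Step 3: paste attention-weighted background patches back
    # (the "deconvolution" step), averaging where patches overlap.
    recon = (attn @ bg).reshape(-1, k, k, c)
    acc = np.zeros((h + 2 * r, w + 2 * r, c))
    cnt = np.zeros((h + 2 * r, w + 2 * r, 1))
    for (i, j), patch in zip(np.argwhere(hole), recon):
        acc[i:i + k, j:j + k] += patch
        cnt[i:i + k, j:j + k] += 1
    filled = acc[r:r + h, r:r + w] / np.maximum(cnt[r:r + h, r:r + w], 1)

    out = feat.copy()
    out[hole] = filled[hole]  # only hole features are replaced
    return out, attn
```

A call such as `out, attn = contextual_attention(feat, hole)` returns the feature map with hole features rebuilt from background patches, plus the attention matrix whose rows each sum to 1. Taking the `argmax` of each row gives the most attended background patch for a foreground pixel, which is the kind of information an attention map visualizes.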
Based on the coarse result from the first encoder-decoder network, two parallel encoders are introduced and then merged into a single decoder to produce the inpainting result. In the attention map visualization, color indicates the relative location of the most attended background patch for each foreground pixel. For example, white (the center of the color coding map) means the pixel attends to itself, pink means it attends to the bottom-left, and green means it attends to the top-right.
Training Procedure
Performance & Ablation Study