Conditional Image Generation with PixelCNN Decoders - van den Oord - NIPS 2016 - TensorFlow & PyTorch Code

 

Info

  • Title: Conditional Image Generation with PixelCNN Decoders
  • Task: Image Generation
  • Author: A. van den Oord, N. Kalchbrenner, O. Vinyals, L. Espeholt, A. Graves, and K. Kavukcuoglu
  • Date: Jun. 2016
  • Arxiv: 1606.05328
  • Published: NIPS 2016
  • Affiliation: Google DeepMind

Highlights & Drawbacks

  • Can be conditioned on class labels or on embeddings produced by a convolutional network
  • Can also serve as a powerful decoder in an image autoencoder

Motivation & Design

To ensure that the CNN only uses information from pixels above and to the left of the current pixel, PixelCNN masks its convolution filters. However, the computational cost rises rapidly when such layers are stacked.
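The masking described above can be sketched as a small helper that builds the 0/1 pattern multiplied into a square kernel. This is a minimal, dependency-free sketch: the function name `causal_mask` is hypothetical, it treats the image as single-channel (the ordering of colour channels within a pixel is ignored), and it follows the common convention that a type "A" mask also hides the centre pixel while a type "B" mask keeps it.

```python
def causal_mask(kernel_size, mask_type="B"):
    """Return a kernel_size x kernel_size 0/1 mask for a masked convolution.

    Positions above the centre row, and positions left of the centre in
    the centre row, stay visible (1). Everything else is hidden (0).
    Type "A" also hides the centre itself; type "B" keeps it.
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")

    centre = kernel_size // 2
    mask = []
    for row in range(kernel_size):
        line = []
        for col in range(kernel_size):
            above = row < centre
            left = row == centre and col < centre
            is_centre = row == centre and col == centre
            visible = above or left or (is_centre and mask_type == "B")
            line.append(1 if visible else 0)
        mask.append(line)
    return mask


if __name__ == "__main__":
    for line in causal_mask(5, "A"):
        print(line)
```

In a real framework the mask would be multiplied element-wise into the convolution weights before every forward pass, so that hidden positions never contribute to the output.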

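The gated activation unit:

$$y = \tanh(W_{k,f} ∗ x) ⊙ σ(W_{k,g} ∗ x)$$

where $σ$ is the sigmoid non-linearity, $k$ is the number of the layer, $⊙$ is the element-wise product and $∗$ is the convolution operator. The subscripts $f$ and $g$ distinguish the two sets of convolution weights: one feeds the $\tanh$ branch, the other feeds the sigmoid gate.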
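A minimal sketch of the gating step, assuming the two masked convolutions have already been computed and their outputs are passed in as flat lists of pre-activations. The names `gated_activation`, `filter_pre` and `gate_pre` are illustrative, not taken from the paper or from any library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gated_activation(filter_pre, gate_pre):
    """Element-wise tanh(filter) * sigmoid(gate).

    filter_pre and gate_pre are the outputs of two separate convolutions
    over the same input, flattened to lists of equal length.
    """
    if len(filter_pre) != len(gate_pre):
        raise ValueError("both branches must have the same length")
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filter_pre, gate_pre)]


if __name__ == "__main__":
    print(gated_activation([0.5, -1.0, 2.0], [0.0, 3.0, -3.0]))
```

The sigmoid branch acts as a soft switch in (0, 1) that scales the tanh branch, so a strongly negative gate value suppresses a feature regardless of the filter response.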

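To condition the model, a high-level image description represented as a latent vector $h$ is added to both branches of the gated unit:

$$y = \tanh(W_{k,f} ∗ x + V_{k,f}^T h) ⊙ σ(W_{k,g} ∗ x + V_{k,g}^T h)$$

where $V_{k,f}$ and $V_{k,g}$ are learned projections of $h$. When $h$ is a one-hot class encoding, this amounts to adding a class-dependent bias at every layer.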
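The conditioning step can be sketched as follows, assuming the convolution outputs are given as a list of spatial positions, each holding one value per channel, and that $h$ is projected by two plain matrices of shape `len(h) x channels`. The names `project`, `conditional_gated_activation`, `V_f` and `V_g` are hypothetical, and this is an illustration under those assumptions rather than the authors' implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def project(V, h):
    """Compute V^T h for V of shape len(h) x channels (list of rows)."""
    if len(V) != len(h):
        raise ValueError("V must have one row per entry of h")
    channels = len(V[0])
    return [sum(V[i][c] * h[i] for i in range(len(h))) for c in range(channels)]


def conditional_gated_activation(filter_pre, gate_pre, h, V_f, V_g):
    """Gated unit with a conditioning vector h.

    filter_pre / gate_pre: list over positions of lists over channels.
    The projected h does not depend on the position, so the same
    per-channel bias is added everywhere in the feature map.
    """
    bias_f = project(V_f, h)
    bias_g = project(V_g, h)
    out = []
    for f_pos, g_pos in zip(filter_pre, gate_pre):
        out.append([
            math.tanh(f + bf) * sigmoid(g + bg)
            for f, g, bf, bg in zip(f_pos, g_pos, bias_f, bias_g)
        ])
    return out


if __name__ == "__main__":
    h = [1.0, 0.0]                      # one-hot class label (class 0 of 2)
    V_f = [[0.5], [-0.5]]               # 2 classes x 1 channel
    V_g = [[0.0], [0.0]]
    print(conditional_gated_activation([[0.0], [1.0]], [[0.0], [0.0]], h, V_f, V_g))
```

With a one-hot `h`, `project` simply selects one row of each matrix, which is why class conditioning reduces to a class-specific bias per channel.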

Performance & Ablation Study

Class-conditioned generation: [figure]

Conditioning on a latent vector (embedding learned by a convolutional network): [figure]

Code