Xception: Deep Learning with Depthwise Separable Convolutions - Chollet - 2016

 

Info

  • Title: Xception: Deep Learning with Depthwise Separable Convolutions
  • Author: F. Chollet
  • Arxiv: 1610.02357
  • Date: Oct. 2016

Highlights & Drawbacks

Replaces the 1×1 and 3×3 convolutions in the Inception unit with depthwise separable convolutions.

Motivation & Design

The paper points out that the Inception unit rests on the assumption that cross-channel correlations and spatial correlations are sufficiently decoupled to be mapped separately. Inception-v3 applies a similar factorization within the spatial dimensions, along width and height: a 3 × 3 convolution is replaced by a 1 × 3 and a 3 × 1 convolution.

Xception goes further and rests on a stronger assumption: cross-channel correlations and spatial correlations can be mapped completely separately. This is the idea that depthwise separable convolution models. A simplified Inception module:

[Figure: a simplified Inception module]

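is equivalent to one large 1 × 1 convolution followed by spatial convolutions over non-overlapping segments of its output channels: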

[Figure: equivalent reformulation of the simplified Inception module]
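
The equivalence can be checked numerically at a single pixel, where a 1 × 1 convolution reduces to a matrix-vector product over channels. The sketch below is illustrative only: the helper `conv1x1`, the branch count, and the toy weights are assumptions, not code or values from the paper. It runs three separate 1 × 1 branches, then one wide 1 × 1 convolution whose output is cut into three channel segments, and compares what each subsequent spatial convolution would receive.

```python
# Toy check: several 1x1 branches == one wide 1x1 conv split into segments.
# At a single pixel, a 1x1 convolution is a matrix-vector product over channels.

def conv1x1(weights, pixel):
    """Apply a 1x1 convolution at one pixel.

    weights: list of output-channel rows, each a list of C_in floats.
    pixel:   list of C_in floats (the channel vector at one location).
    """
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]

# Hypothetical input: one pixel with 4 channels.
pixel = [1.0, 2.0, -1.0, 0.5]

# Three hypothetical branches, each mapping 4 channels to 2 channels.
branch_weights = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0]],
    [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, -1.0]],
]

# Simplified Inception module: each branch runs its own 1x1 convolution.
separate = [conv1x1(w, pixel) for w in branch_weights]

# Reformulation: stack all branch weights into one wide 1x1 convolution...
wide_weights = [row for w in branch_weights for row in w]
wide_out = conv1x1(wide_weights, pixel)

# ...then cut its output into non-overlapping segments, one per branch.
seg = len(branch_weights[0])
segments = [wide_out[i:i + seg] for i in range(0, len(wide_out), seg)]

print(separate == segments)
```

Each segment then feeds its own spatial convolution, exactly as each branch output did in the original module.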

Pushing the number of channel segments to the extreme, with one spatial convolution per channel, we obtain the depthwise separable convolution:

[Figure: the extreme case, with one spatial convolution per channel]
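
As a rough illustration, a depthwise separable convolution can be sketched in plain Python as a per-channel spatial convolution followed by a 1 × 1 (pointwise) convolution. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses the common depthwise-then-pointwise order, no bias, no padding, stride 1, and no non-linearity between the two steps. All function names, the toy input, and the 128 and 256 channel sizes in the parameter count are hypothetical.

```python
# Minimal pure-Python sketch of a depthwise separable convolution
# (depthwise k x k followed by pointwise 1x1; no bias, no padding, stride 1).

def depthwise_conv(x, kernels):
    """x: [C][H][W] input; kernels: [C][k][k], one spatial kernel per channel."""
    out = []
    for chan, kern in zip(x, kernels):
        k = len(kern)
        h_out = len(chan) - k + 1
        w_out = len(chan[0]) - k + 1
        out.append([
            [
                sum(chan[i + di][j + dj] * kern[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out

def pointwise_conv(x, weights):
    """x: [C_in][H][W]; weights: [C_out][C_in]. Mixes channels at each pixel."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(row[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Spatial filtering per channel, then channel mixing per pixel."""
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)

def param_counts(c_in, c_out, k=3):
    """Weights (no bias) for a regular conv vs. a depthwise separable one."""
    regular = k * k * c_in * c_out
    separable = k * k * c_in + c_in * c_out
    return regular, separable

# Toy input: 2 channels of 3x3, all-ones depthwise kernels (hypothetical values).
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_kernels = [[[1] * 3 for _ in range(3)] for _ in range(2)]
pw_weights = [[1, 1], [1, -1]]  # 2 input channels -> 2 output channels

y = depthwise_separable_conv(x, dw_kernels, pw_weights)
print(y)                       # [[[50]], [[40]]]
print(param_counts(128, 256))  # (294912, 33920)
```

With these illustrative sizes, the separable layer uses roughly a ninth of the weights of a regular 3 × 3 convolution, because the spatial kernels are no longer replicated for every input-output channel pair.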

NetScope Visualization and source code: awesome_cnn.
