SNIPER: efficient multi-scale training - Singh - NIPS 2018 - MXNet Code



  • Title: SNIPER: efficient multi-scale training
  • Task: Object Detection
  • Authors: B. Singh, M. Najibi, and L. S. Davis
  • Date: May 2018
  • Arxiv: 1805.09300
  • Published: NIPS 2018

Highlights & Drawbacks

  • An efficient version of the SNIP training strategy for object detection
  • Only ROIs of suitable size are selected within a batch

Motivation & Design

Following SNIP, the authors train on crops of an image that contain the objects to be detected (called chips) instead of the entire image. Because chips are smaller than the full image, this design also makes large-batch training possible, which accelerates training. Each chip keeps the context around its objects while skipping unnecessary computation on simple background (such as the sky), so the training data is used more efficiently.
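The idea of covering objects with chips can be sketched as a greedy set-cover over a grid of candidate crops. This is a minimal illustration, not the authors' MXNet implementation: the function names, the 512-pixel chip size, and the 32-pixel stride are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def contains(chip: Box, box: Box) -> bool:
    """True if the box lies completely inside the chip."""
    return (chip[0] <= box[0] and chip[1] <= box[1]
            and chip[2] >= box[2] and chip[3] >= box[3])


def candidate_chips(width: int, height: int,
                    chip_size: int, stride: int) -> List[Box]:
    """Square candidate chips placed on a regular grid (assumed layout)."""
    xs = range(0, max(width - chip_size, 0) + 1, stride)
    ys = range(0, max(height - chip_size, 0) + 1, stride)
    return [(x, y, x + chip_size, y + chip_size) for y in ys for x in xs]


def select_chips(boxes: List[Box], width: int, height: int,
                 chip_size: int = 512, stride: int = 32) -> List[Box]:
    """Greedily pick chips until every coverable box is inside some chip."""
    chips = candidate_chips(width, height, chip_size, stride)
    uncovered = set(range(len(boxes)))
    selected: List[Box] = []
    while uncovered:
        best_chip, best_cov = None, set()
        for chip in chips:
            cov = {i for i in uncovered if contains(chip, boxes[i])}
            if len(cov) > len(best_cov):
                best_chip, best_cov = chip, cov
        if best_chip is None:
            break  # the remaining boxes fit in no candidate chip
        selected.append(best_chip)
        uncovered -= best_cov
    return selected


boxes = [(10, 10, 100, 100), (50, 50, 200, 200), (800, 800, 900, 900)]
chips = select_chips(boxes, width=1024, height=1024)
```

Here two chips cover all three boxes, so the network processes two 512×512 crops instead of the whole 1024×1024 image, and the empty regions are never fed forward.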

The core design of SNIPER is its strategy for selecting ROIs from a chip (a crop of the entire image). The authors use several hyperparameters to keep only boxes of suitable size in a batch, hoping that the detector network learns features that do not depend on object size.
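The size-based selection can be illustrated with a small filter. This is a sketch under stated assumptions: box size is taken as the square root of the box area after rescaling, and the function name, the scale factor, and the range limits are hypothetical values for illustration, not the hyperparameters from the paper.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def boxes_in_size_range(boxes: List[Box], scale: float,
                        min_side: float, max_side: float) -> List[Box]:
    """Keep boxes whose rescaled side length falls in [min_side, max_side].

    The side length is sqrt(area) multiplied by the image scale factor
    (an assumed size measure for this example).
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        side = math.sqrt((x2 - x1) * (y2 - y1)) * scale
        if min_side <= side <= max_side:
            kept.append((x1, y1, x2, y2))
    return kept


boxes = [(0, 0, 10, 10), (0, 0, 100, 100), (0, 0, 400, 400)]

# At the original resolution only the medium box is in range.
medium = boxes_in_size_range(boxes, scale=1.0, min_side=32, max_side=256)

# After upsampling by 3x, only the small box is in range.
small = boxes_in_size_range(boxes, scale=3.0, min_side=0, max_side=80)
```

Each scale therefore trains on a different subset of the objects, and a box that is too large or too small at one scale is handled at another.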

Thanks to its memory-efficient design, SNIPER can benefit from Batch Normalization during training, and larger batch sizes become possible for instance-level recognition tasks on a single GPU. Hence, there is no need to synchronize batch-normalization statistics across GPUs.

Performance & Ablation Study

The authors report improved accuracy on small objects in their experiments.
