SNIPER: efficient multi-scale training - Singh - NIPS 2018 - MXNet Code

Info

Title: SNIPER: efficient multi-scale training
Task: Object Detection
Author: B. Singh, M. Najibi, and L. S. Davis
Date: May 2018
Arxiv: 1805.09300
Published: NIPS 2018

Highlights & Drawbacks

An efficient version of the SNIP training strategy for object detection
Only ROIs of a suitable size are selected within a batch

Motivation & Design


Following SNIP, the authors train on crops of an image that contain the objects to be detected (called chips) instead of on the entire image. Because chips are much smaller than full images, large-batch training becomes possible, which accelerates training. Each chip keeps the context around its objects while skipping computation on simple background (such as the sky), so the training data is used more efficiently.
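The chip idea can be sketched in a few lines. This is a minimal illustration only, not the authors' MXNet code: the function names, the chip size, and the stride are hypothetical, and the greedy "cover the most remaining boxes" rule is an assumed selection heuristic. Candidate chips are laid on a regular grid, and chips are picked until every ground-truth box is fully contained in some chip.

```python
def generate_chips(img_w, img_h, chip_size, stride):
    """Lay fixed-size candidate chips on a regular grid (hypothetical helper)."""
    xs = range(0, max(img_w - chip_size, 0) + 1, stride)
    ys = range(0, max(img_h - chip_size, 0) + 1, stride)
    return [(x, y, x + chip_size, y + chip_size) for y in ys for x in xs]


def contains(chip, box):
    """True if box (x1, y1, x2, y2) lies completely inside chip."""
    return (chip[0] <= box[0] and chip[1] <= box[1]
            and box[2] <= chip[2] and box[3] <= chip[3])


def select_positive_chips(chips, boxes):
    """Greedily pick chips until every coverable box is inside a chosen chip."""
    uncovered = set(range(len(boxes)))
    selected = []
    while uncovered:
        best, best_cov = None, set()
        for chip in chips:
            cov = {i for i in uncovered if contains(chip, boxes[i])}
            if len(cov) > len(best_cov):
                best, best_cov = chip, cov
        if best is None:  # remaining boxes fit in no candidate chip
            break
        selected.append(best)
        uncovered -= best_cov
    return selected


# Toy example: a 1024x512 image with three ground-truth boxes.
chips = generate_chips(1024, 512, chip_size=512, stride=256)
boxes = [(10, 10, 50, 50), (600, 100, 700, 200), (300, 300, 400, 400)]
positive_chips = select_positive_chips(chips, boxes)
```

In this toy case only two of the three candidate chips are kept, and regions with no objects would produce no chips at all, which is where the savings on plain background come from.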


The core design of SNIPER is its strategy for selecting ROIs from a chip (a crop of the entire image). The authors use several hyperparameters to keep only boxes of a suitable size in each batch, hoping that the detector learns features that do not depend on object size.
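A size filter of this kind might look like the sketch below. It is an assumption-laden illustration rather than the paper's implementation: `select_valid_boxes`, `min_side`, and `max_side` are hypothetical names, the thresholds are made up, and box size is measured here as the square root of the area after rescaling.

```python
import math


def select_valid_boxes(boxes, chip, scale, min_side, max_side):
    """Keep boxes inside the chip whose rescaled size is within [min_side, max_side].

    boxes: list of (x1, y1, x2, y2) in original image coordinates
    chip:  (x1, y1, x2, y2) of the chip in original image coordinates
    scale: resize factor applied to the chip before it is fed to the network
    Returns the kept boxes in rescaled chip coordinates.
    """
    cx1, cy1, cx2, cy2 = chip
    kept = []
    for x1, y1, x2, y2 in boxes:
        # Ignore boxes that are not fully inside this chip.
        if x1 < cx1 or y1 < cy1 or x2 > cx2 or y2 > cy2:
            continue
        # Size of the box as the network would see it at this scale.
        side = math.sqrt((x2 - x1) * (y2 - y1)) * scale
        if min_side <= side <= max_side:
            kept.append(((x1 - cx1) * scale, (y1 - cy1) * scale,
                         (x2 - cx1) * scale, (y2 - cy1) * scale))
    return kept


# Toy example: at 2x upsampling the small box becomes valid, the large one does not.
boxes = [(10, 10, 30, 30), (100, 100, 200, 200)]
valid = select_valid_boxes(boxes, chip=(0, 0, 256, 256), scale=2.0,
                           min_side=32, max_side=150)
```

With the same thresholds at `scale=1.0`, the situation flips: the small box is too small and the larger one falls in range, so each object is trained on at the scale where it has a suitable size.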

Due to its memory-efficient design, SNIPER can benefit from Batch Normalization during training, and larger batch sizes become possible for instance-level recognition tasks on a single GPU. Hence, there is no need to synchronize batch-normalization statistics across GPUs.