- Title: SNIPER: efficient multi-scale training
- Task: Object Detection
- Authors: B. Singh, M. Najibi, and L. S. Davis
- Date: May 2018
- Arxiv: 1805.09300
- Published: NIPS 2018
Highlights & Drawbacks
- An efficient version of the SNIP training strategy for object detection
- Within each batch, only ROIs whose size falls in a suitable range are selected for training
Motivation & Design
Following SNIP, the authors train on crops of an image that contain objects to be detected (called chips) instead of the entire image. Because chips are much smaller than the full image, large-batch training becomes possible, which accelerates the training process. Each chip still includes the context around its objects, while simple background regions (such as the sky) are skipped, so unnecessary computation is avoided and the training data is used more efficiently.
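The idea of covering objects with a small number of chips can be sketched as a greedy selection over sliding-window candidates. This is a minimal illustration, not the authors' implementation: the function names (`candidate_chips`, `select_chips`), the chip size, the stride, and the example boxes are all assumptions made for the sketch, and it handles a single scale only.

```python
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def encloses(chip: Box, box: Box) -> bool:
    """Return True if the box lies completely inside the chip."""
    return (chip[0] <= box[0] and chip[1] <= box[1]
            and box[2] <= chip[2] and box[3] <= chip[3])


def candidate_chips(width: int, height: int,
                    chip_size: int, stride: int) -> List[Box]:
    """Slide a square window over the image at a fixed stride (hypothetical helper)."""
    xs = range(0, max(width - chip_size, 0) + 1, stride)
    ys = range(0, max(height - chip_size, 0) + 1, stride)
    return [(x, y, x + chip_size, y + chip_size) for y in ys for x in xs]


def select_chips(boxes: List[Box], chips: List[Box]) -> List[Box]:
    """Greedily pick the chip that encloses the most still-uncovered boxes."""
    remaining: Set[int] = set(range(len(boxes)))
    selected: List[Box] = []
    while remaining:
        best_chip, best_cover = None, set()
        for chip in chips:
            cover = {i for i in remaining if encloses(chip, boxes[i])}
            if len(cover) > len(best_cover):
                best_chip, best_cover = chip, cover
        if not best_cover:  # the remaining boxes fit in no candidate chip
            break
        selected.append(best_chip)
        remaining -= best_cover
    return selected


# Illustrative values only: a 1000x600 image with three ground-truth boxes.
boxes = [(10, 10, 60, 60), (100, 120, 180, 200), (800, 400, 900, 500)]
chips = candidate_chips(width=1000, height=600, chip_size=512, stride=32)
selected = select_chips(boxes, chips)
```

In this example two chips are enough to enclose all three boxes, so the network would process two small crops rather than the whole image. Regions that contain no box are never selected by this sketch.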
The core design of SNIPER is the strategy for selecting ROIs within a chip (a crop of the entire image). The authors use several hyperparameters to keep only the boxes whose size falls in a suitable range for the chip's scale, hoping that the detector network is not forced to learn features for objects at extreme sizes.
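A size-based filter of this kind might look like the sketch below. It assumes that boxes and chips are given in original-image coordinates, that `scale` is the resize factor applied to the chip, and that `size_range` bounds the square root of the box area after resizing. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in original-image pixels


def valid_boxes_for_chip(boxes: List[Box], chip: Box, scale: float,
                         size_range: Tuple[float, float]) -> List[Box]:
    """Keep boxes that lie inside the chip and have a suitable size at this scale.

    A box's size is taken as sqrt(width * height) after multiplying by `scale`
    (an assumption of this sketch).
    """
    lo, hi = size_range
    kept = []
    for x1, y1, x2, y2 in boxes:
        inside = (chip[0] <= x1 and chip[1] <= y1
                  and x2 <= chip[2] and y2 <= chip[3])
        size = math.sqrt((x2 - x1) * (y2 - y1)) * scale
        if inside and lo <= size <= hi:
            kept.append((x1, y1, x2, y2))
    return kept


# Illustrative values only: one chip that is upsampled 3x for training.
boxes = [
    (10, 10, 30, 30),      # small object inside the chip
    (40, 40, 160, 160),    # large object inside the chip
    (200, 200, 220, 220),  # small object outside the chip
]
chip = (0, 0, 170, 170)
kept = valid_boxes_for_chip(boxes, chip, scale=3.0, size_range=(0.0, 80.0))
```

Here only the first box is kept: the second is too large once upsampled, and the third lies outside the chip. At a coarser scale with a different `size_range`, the large box would be the one selected instead.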
Due to its memory-efficient design, SNIPER can benefit from Batch Normalization during training, and it makes larger batch sizes possible for instance-level recognition tasks on a single GPU. Hence, there is no need to synchronize batch-normalization statistics across GPUs.
Performance & Ablation Study
The authors' experiments report improved accuracy on small objects.
- You Only Look Once: Unified, Real-Time Object Detection - Redmon et al. - CVPR 2016
- (SNIP) An Analysis of Scale Invariance in Object Detection - Singh and Davis - CVPR 2018
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks - Ren et al. - NIPS 2015
- (FPN) Feature Pyramid Networks for Object Detection - Lin et al. - CVPR 2017
- (RetinaNet) Focal Loss for Dense Object Detection - Lin et al. - ICCV 2017
- YOLO9000: Better, Faster, Stronger - Redmon et al. - 2016