Common Datasets for Image Super-Resolution


Image Super-Resolution Task

Single image super-resolution (SISR) is a notoriously challenging ill-posed problem: a given low-resolution (LR) input can correspond to many possible high-resolution (HR) images, and the HR space that we intend to map the LR input to (in most cases, the natural image space) is usually intractable [5]. Previous SISR methods have two main drawbacks: the mapping between the LR space and the HR space is not clearly defined, and establishing a complex high-dimensional mapping from massive raw data is inefficient. Because deep learning (DL) models are good at extracting effective high-level abstractions that bridge the LR and HR spaces, recent DL-based SISR methods have achieved significant improvements, both quantitatively and qualitatively.


| Name | Datasets | Short Description | Google Drive | Baidu Drive |
|---|---|---|---|---|
| Classical SR Training | T91 | 91 images for training | Google Drive | Baidu Drive |
| | BSDS200 | A subset (train) of BSD500 for training | | |
| | General100 | 100 images for training | | |
| Classical SR Testing | Set5 | Set5 test dataset | | |
| | Set14 | Set14 test dataset | | |
| | BSDS100 | A subset (test) of BSD500 for testing | | |
| | urban100 | 100 building images for testing (regular structures) | | |
| | manga109 | 109 images of Japanese manga for testing | | |
| | historical | 10 gray LR images without the ground-truth | | |
| 2K Resolution | DIV2K | proposed in NTIRE17 (800 train and 100 validation) | Google Drive | Baidu Drive |
| | Flickr2K | 2650 2K images from Flickr for training | | |
| | DF2K | A merged training dataset of DIV2K and Flickr2K | | |
| OST (Outdoor Scenes) | OST Training | 7 categories of images with rich textures | Google Drive | Baidu Drive |
| | OST300 | 300 test images of outdoor scenes | | |
| PIRM | PIRM | PIRM self-val, val, test datasets | Google Drive | Baidu Drive |

The download links above are provided by mmsr.
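
The table describes DF2K as a merged training set of DIV2K (800 training images) and Flickr2K (2650 images). A minimal sketch of such a merge is shown here, assuming each dataset is available as a list of file names. The function name `merge_training_lists`, the source prefixes, and the file-naming scheme are hypothetical and only illustrate the idea; the real archives may be organised differently.

```python
def merge_training_lists(div2k, flickr2k):
    """Merge two file-name lists into one training list.

    Each name is prefixed with its source so that identical file names
    from the two datasets cannot collide.
    """
    merged = [f"DIV2K/{name}" for name in div2k]
    merged += [f"Flickr2K/{name}" for name in flickr2k]
    if len(set(merged)) != len(merged):
        raise ValueError("duplicate file names within a source list")
    return merged


# Hypothetical file names, used only to reproduce the image counts in the table.
div2k_train = [f"{i:04d}.png" for i in range(1, 801)]   # 800 training images
flickr2k = [f"{i:06d}.png" for i in range(1, 2651)]     # 2650 images

df2k = merge_training_lists(div2k_train, flickr2k)
print(len(df2k))  # 3450
```

With the counts from the table, the merged list holds 800 + 2650 = 3450 training images.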

The Set5, Set14, BSDS100, urban100 datasets can be found at the project page of LapSRN.

The annual PIRM Challenge (dataset, 2018 ECCV Workshop) is a leading benchmark for image super-resolution models.


Data overview

The DIV2K dataset is divided into:

  • train data: starting from 800 high-definition, high-resolution images, corresponding low-resolution images are obtained, and both the high- and low-resolution images are provided for downscaling factors of 2, 3, and 4.
  • validation data: 100 high-definition, high-resolution images are used to generate corresponding low-resolution images; the low-resolution images are provided from the beginning of the challenge so that participants can get online feedback from the validation server; the high-resolution images will be released when the final phase of the challenge starts.
  • test data: 100 diverse images are used to generate corresponding low-resolution images; participants will receive the low-resolution images when the final evaluation phase starts, and the results will be announced after the challenge is over and the winners are decided.
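
The way one HR image yields LR counterparts for factors 2, 3, and 4 can be sketched as follows. This is only an illustration under stated assumptions: the text above does not name the downscaling operator, so a simple box average is used here as a stand-in, and the image is represented as a plain nested list of grayscale values. The helper names `modcrop`, `downscale_box`, and `make_lr_set` are hypothetical.

```python
from typing import Dict, List, Sequence

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def modcrop(img: Image, scale: int) -> Image:
    """Crop the bottom and right edges so height and width are multiples of scale."""
    height = len(img) - len(img) % scale
    width = len(img[0]) - len(img[0]) % scale
    return [row[:width] for row in img[:height]]


def downscale_box(img: Image, scale: int) -> Image:
    """Downscale by averaging each scale x scale block (assumed stand-in operator)."""
    hr = modcrop(img, scale)
    height, width = len(hr), len(hr[0])
    lr = []
    for y in range(0, height, scale):
        row = []
        for x in range(0, width, scale):
            block = [hr[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            row.append(sum(block) / len(block))
        lr.append(row)
    return lr


def make_lr_set(img: Image, scales: Sequence[int] = (2, 3, 4)) -> Dict[int, Image]:
    """Return one LR image per downscaling factor, keyed by the factor."""
    return {s: downscale_box(img, s) for s in scales}


# Usage: a synthetic 12 x 12 "HR" image yields 6 x 6, 4 x 4 and 3 x 3 LR images.
hr_image = [[float(y * 12 + x) for x in range(12)] for y in range(12)]
lr_images = make_lr_set(hr_image)
```

Cropping to a multiple of the factor first keeps every HR/LR pair exactly aligned, so an LR pixel always corresponds to a whole block of HR pixels.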