The ImageNet dataset contains 14,197,122 images annotated according to the WordNet hierarchy. Since 2010 the dataset has been used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a benchmark in image classification and object detection. The publicly released dataset contains a set of manually annotated training images. A set of test images is also released, with the manual annotations withheld. ILSVRC annotations fall into one of two categories: (1) image-level annotation of a binary label for the presence or absence of an object class in the image, e.g., “there are cars in this image” but “there are no tigers,” and (2) object-level annotation of a tight bounding box and class label around an object instance in the image, e.g., “there is a screwdriver centered at position (20,25) with width of 50 pixels and height of 30 pixels”. The ImageNet project does not own the copyright of the images, so only thumbnails and URLs of images are provided.
13,671 PAPERS • 41 BENCHMARKS
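The two ILSVRC annotation categories described above can be pictured as two small record types. The following is only a minimal sketch: the class and field names (ImageLevelLabel, BoundingBox, corners) are hypothetical and do not reflect the actual ILSVRC file format. It encodes the examples from the description and converts the center/size box into corner coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageLevelLabel:
    """Binary presence/absence label for one object class (hypothetical type)."""
    class_name: str
    present: bool


@dataclass(frozen=True)
class BoundingBox:
    """Object-level annotation: class label plus a center/size box in pixels
    (hypothetical type, not the real ILSVRC schema)."""
    class_name: str
    center_x: float
    center_y: float
    width: float
    height: float

    def corners(self) -> tuple[float, float, float, float]:
        """Return (x_min, y_min, x_max, y_max) derived from center and size."""
        half_w = self.width / 2
        half_h = self.height / 2
        return (
            self.center_x - half_w,
            self.center_y - half_h,
            self.center_x + half_w,
            self.center_y + half_h,
        )


# Image-level: "there are cars in this image" but "there are no tigers".
labels = [ImageLevelLabel("car", True), ImageLevelLabel("tiger", False)]

# Object-level: screwdriver centered at (20, 25), 50 px wide, 30 px high.
box = BoundingBox("screwdriver", center_x=20, center_y=25, width=50, height=30)
print(box.corners())  # -> (-5.0, 10.0, 45.0, 40.0)
```

Taken literally, the numbers in the description's example give a left edge at x = -5, which is outside the image, so code that consumes boxes in this form may need to clip them to the image bounds.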
The CelebFaces Attributes (CelebA) dataset contains 202,599 face images of size 178×218 from 10,177 celebrities, each annotated with 40 binary labels indicating facial attributes such as hair color, gender and age.
3,124 PAPERS • 20 BENCHMARKS
The GoPro dataset for deblurring consists of 3,214 blurred images of size 1,280×720, divided into 2,103 training images and 1,111 test images. The dataset consists of pairs of a realistic blurry image and the corresponding ground-truth sharp image, obtained with a high-speed camera.
314 PAPERS • 3 BENCHMARKS
Consists of 8,422 pairs of blurry and sharp images with 65,784 densely annotated foreground (FG) human bounding boxes.
71 PAPERS • 4 BENCHMARKS
This dataset contains paired sharp and blurred microscopy images of Leishmania, a protozoan parasite, obtained from preserved slides stained with Giemsa. The paired images were acquired with a bright-field microscope (Olympus IX53) using 100× magnification oil immersion objectives. We first captured the sharp images as ground truth, then acquired their corresponding out-of-focus images. The extent and nature of the defocus vary randomly along the optical axis, so the degree of defocus is inconsistent from image to image. The dataset includes 764 in-focus and 764 corresponding out-of-focus images, each 2304 × 1728 pixels in 24-bit JPG format.
1 PAPER • NO BENCHMARKS YET