The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.
10,363 PAPERS • 93 BENCHMARKS
The Flickr30k dataset contains 31,000 images collected from Flickr, each paired with 5 reference sentences provided by human annotators.
754 PAPERS • 9 BENCHMARKS
Transforms the ImageNet-1K classification dataset for Chinese models by translating labels and prompts into Chinese.
3 PAPERS • 1 BENCHMARK
A Zero-Shot Sketch-based Inter-Modal Object Retrieval Scheme for Remote Sensing Images
1 PAPER • NO BENCHMARKS YET