15 dataset results for Image Classification AND Medical

ChestX-ray8

ChestX-ray8 is a medical imaging dataset comprising 108,948 frontal-view X-ray images of 32,717 unique patients, collected from 1992 to 2015. Each image carries labels for eight common diseases, text-mined from the associated radiological reports via NLP techniques.

78 PAPERS • NO BENCHMARKS YET

Chaoyang

The Chaoyang dataset contains 1111 normal, 842 serrated, 1404 adenocarcinoma, and 664 adenoma samples for training, and 705 normal, 321 serrated, 840 adenocarcinoma, and 273 adenoma samples for testing. This noisy dataset was constructed in a real-world scenario.
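The per-class counts above are easier to compare once tabulated by split. The following is a minimal sketch that only restates the quoted numbers; the dictionary and variable names are illustrative and not part of any official loader.

```python
# Per-class sample counts copied from the description above.
# Names here are illustrative assumptions, not an official API.
train = {"normal": 1111, "serrated": 842, "adenocarcinoma": 1404, "adenoma": 664}
test = {"normal": 705, "serrated": 321, "adenocarcinoma": 840, "adenoma": 273}

train_total = sum(train.values())  # 4021
test_total = sum(test.values())    # 2139

# Print each class's share of its split to make the imbalance visible.
for label in train:
    train_share = train[label] / train_total
    test_share = test[label] / test_total
    print(f"{label:<15} train {train_share:6.1%}   test {test_share:6.1%}")

print(f"total: {train_total} train, {test_total} test")
```

The shares show how unevenly the four classes are represented in each split, which matters when choosing a loss or sampling strategy.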

13 PAPERS • 2 BENCHMARKS

Hyper-Kvasir Dataset

The HyperKvasir dataset contains 110,079 images and 374 videos capturing anatomical landmarks as well as pathological and normal findings, for a total of around 1 million images and video frames.

11 PAPERS • 2 BENCHMARKS

BCN_20000

BCN_20000 is a dataset composed of 19,424 dermoscopic images of skin lesions captured from 2010 to 2016 in the facilities of the Hospital Clínic in Barcelona. The dataset can be used for lesion recognition tasks such as lesion segmentation, lesion detection and lesion classification.

10 PAPERS • NO BENCHMARKS YET

BreakHis (Breast Cancer Histopathological Database)

The Breast Cancer Histopathological Image Classification (BreakHis) dataset is composed of 7,909 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X). It contains 2,480 benign and 5,429 malignant samples (700×460 pixels, 3-channel RGB, 8-bit depth in each channel, PNG format). This database was built in collaboration with the P&D Laboratory - Pathological Anatomy and Cytopathology, Parana, Brazil.

9 PAPERS • 5 BENCHMARKS

DFUC2021 (Diabetic Foot Ulcers 2021)

The Diabetic Foot Ulcers dataset (DFUC2021) is a dataset for analysis of pathology, focusing on infection and ischaemia. The final release of DFUC2021 consists of 15,683 DFU patches: 5,955 for training, 5,734 for testing, and 3,994 unlabeled. The ground truth labels comprise four classes: control, infection, ischaemia, and both conditions.
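As a quick sanity check on the split sizes quoted above, the sketch below confirms that the three parts add up to the stated total and reports each part's share. The names `splits` and `fractions` are illustrative assumptions, not part of the dataset release.

```python
# Split sizes copied from the description above.
splits = {"train": 5955, "test": 5734, "unlabeled": 3994}
reported_total = 15683

total = sum(splits.values())
# The three parts should account for every patch in the release.
assert total == reported_total

# Share of the full release taken by each part.
fractions = {name: count / total for name, count in splits.items()}
for name, share in fractions.items():
    print(f"{name:<10} {splits[name]:>6}  {share:.1%}")
```

Roughly a quarter of the patches are unlabeled, so they can only contribute through semi-supervised or self-supervised training.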

7 PAPERS • NO BENCHMARKS YET

DiagSet

DiagSet is a histopathological dataset for prostate cancer detection. The proposed dataset consists of over 2.6 million tissue patches extracted from 430 fully annotated scans, 4675 scans with assigned binary diagnosis, and 46 scans with diagnosis given independently by a group of histopathologists.

4 PAPERS • NO BENCHMARKS YET

LIMUC (Labeled Images for Ulcerative Colitis)

The LIMUC dataset is the largest publicly available labeled ulcerative colitis dataset, comprising 11276 images from 564 patients and 1043 colonoscopy procedures. Three experienced gastroenterologists were involved in the annotation process, and all images are labeled according to the Mayo endoscopic score (MES).

4 PAPERS • 1 BENCHMARK

LKS (Liver Kidney Stomach)

LKS is a dataset of 684 Liver-Kidney-Stomach immunofluorescence whole slide images (WSIs) used in the investigation of autoimmune liver disease.

3 PAPERS • NO BENCHMARKS YET

MIMIC-CXR-LT (long-tailed version of MIMIC-CXR)

MIMIC-CXR-LT. We construct a single-label, long-tailed version of MIMIC-CXR in a similar manner. MIMIC-CXR is a multi-label classification dataset with over 200,000 chest X-rays labeled with 13 pathologies and a “No Findings” class. The resulting MIMIC-CXR-LT dataset contains 19 classes, of which 10 are head classes, 6 are medium classes, and 3 are tail classes. MIMIC-CXR-LT contains 111,792 images labeled with one of 18 diseases, with 87,493 training images and 23,550 test set images. The validation and balanced test sets contain 15 and 30 images per class, respectively.

3 PAPERS • 1 BENCHMARK

NIH-CXR-LT (Long-tailed (LT) NIH ChestXRay14)

NIH-CXR-LT. NIH ChestXRay14 contains over 100,000 chest X-rays labeled with 14 pathologies, plus a “No Findings” class. We construct a single-label, long-tailed version of the NIH ChestXRay14 dataset by introducing five new disease findings described above. The resulting NIH-CXR-LT dataset has 20 classes, including 7 head classes, 10 medium classes, and 3 tail classes. NIH-CXR-LT contains 88,637 images labeled with one of 19 thorax diseases, with 68,058 training and 20,279 test images. The validation and balanced test sets contain 15 and 30 images per class, respectively.
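The counts in this description can be cross-checked with a few lines of arithmetic. The sketch below is only an observation about the quoted numbers: it assumes, without confirmation from the description, that the reported total covers the training, test, and validation sets together, with the validation set holding 15 images for each of the 20 classes. All variable names are illustrative.

```python
# Class groups and image counts copied from the description above.
class_groups = {"head": 7, "medium": 10, "tail": 3}
num_classes = sum(class_groups.values())
assert num_classes == 20  # matches the stated number of classes

train_images = 68058
test_images = 20279
reported_total = 88637

# ASSUMPTION: the validation set has 15 images per class and is counted
# in the reported total; the description does not state this explicitly.
val_images = 15 * num_classes

# Images not accounted for by the training and test sets.
gap = reported_total - (train_images + test_images)
print("train + test:", train_images + test_images)
print("gap to reported total:", gap)
print("assumed validation size:", val_images)
```

Under that assumption the gap between the reported total and the training plus test images equals the validation set size, which suggests the three figures are mutually consistent.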

3 PAPERS • 1 BENCHMARK

Endotect Polyp Segmentation Challenge Dataset

A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying images from the GI tract into 23 distinct classes. The second task focuses on efficient classification, measured by the amount of time spent processing each image. The last task relates to automatically segmenting polyps.

2 PAPERS • 1 BENCHMARK

Kvasir-Capsule

The Kvasir-Capsule dataset is the largest publicly released video capsule endoscopy (VCE) dataset. In total, the dataset contains 47,238 labeled images and 117 videos capturing anatomical landmarks as well as pathological and normal findings. The result is more than 4,741,621 images and video frames altogether.

2 PAPERS • NO BENCHMARKS YET

Cervix93 Cytology Dataset

The dataset has 93 image stacks and their corresponding Extended Depth of Field (EDF) images, acquired from cases graded Negative, LSIL, or HSIL (The Bethesda System):
- Negative: 16
- LSIL: 46
- HSIL: 31

The ground truth includes the grade label for each frame and manually marked points inside cervical cells in each frame. In total there are 2705 manually marked points across all frames:
- Negative: 238
- LSIL: 1536
- HSIL: 931

1 PAPER • NO BENCHMARKS YET

Synthetic COVID-19 CXR Dataset

A public open dataset of synthetic chest X-ray images of COVID-19.

1 PAPER • NO BENCHMARKS YET
