Complete Instances Mining for Weakly Supervised Instance Segmentation
Weakly supervised instance segmentation (WSIS) using only image-level labels is a challenging task due to the difficulty of aligning coarse annotations with the finer task. However, with the advancement of deep neural networks (DNNs), WSIS has garnered significant attention. Following a proposal-based paradigm, we encounter a redundant segmentation problem resulting from a single instance being represented by multiple proposals. For example, we feed a picture of a dog and proposals into the network and expect to output only one proposal containing a dog, but the network outputs multiple proposals. To address this problem, we propose a novel approach for WSIS that focuses on the online refinement of complete instances through the use of MaskIoU heads to predict the integrity scores of proposals and a Complete Instances Mining (CIM) strategy to explicitly model the redundant segmentation problem and generate refined pseudo labels. Our approach allows the network to become aware of multiple instances and complete instances, and we further improve its robustness through the incorporation of an Anti-noise strategy. Empirical evaluations on the PASCAL VOC 2012 and MS COCO datasets demonstrate that our method achieves state-of-the-art performance with a notable margin. Our implementation will be made available at https://github.com/ZechengLi19/CIM.
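To make the redundant segmentation problem concrete, the following is a minimal sketch, not the authors' CIM implementation. It assumes each proposal is a boolean mask with a scalar score (standing in for the integrity score that a MaskIoU head would predict) and uses plain greedy suppression on mask IoU to keep one proposal per instance. The function names `mask_iou` and `select_complete_instances` and the 0.5 threshold are hypothetical choices made for illustration.

```python
import numpy as np


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean masks of the same shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union > 0 else 0.0


def select_complete_instances(masks, scores, iou_thresh=0.5):
    """Greedy selection (assumed stand-in, not the paper's CIM strategy).

    Visit proposals from highest to lowest score and keep a proposal only
    if its mask IoU with every already-kept proposal is below iou_thresh.
    """
    order = np.argsort(scores)[::-1]
    kept = []
    for idx in order:
        if all(mask_iou(masks[idx], masks[k]) < iou_thresh for k in kept):
            kept.append(int(idx))
    return kept


# Toy 6x6 image: two proposals cover the same dog, one covers another object.
full_dog = np.zeros((6, 6), dtype=bool)
full_dog[1:5, 1:5] = True          # complete instance, 16 pixels

dog_part = np.zeros((6, 6), dtype=bool)
dog_part[1:5, 1:4] = True          # partial view of the same dog, 12 pixels

other = np.zeros((6, 6), dtype=bool)
other[5:6, 4:6] = True             # a separate, non-overlapping object

masks = [dog_part, full_dog, other]
scores = np.array([0.6, 0.9, 0.7])  # hypothetical integrity scores

kept = select_complete_instances(masks, scores)
print(kept)  # [1, 2]
```

In this toy case the partial dog proposal overlaps the complete one with IoU 0.75, so it is suppressed, while the unrelated object is kept. The method described above goes further by refining such selections online and using them as pseudo labels during training.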
International Joint 2023
Results from the Paper
Ranked #1 on Image-level Supervised Instance Segmentation on COCO 2017 val (using extra training data)
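Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM + Mask R-CNN | AP | 17.0 | # 1 |
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM + Mask R-CNN | AP@50 | 29.4 | # 1 |
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM + Mask R-CNN | AP@75 | 17.0 | # 1 |
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM | AP | 11.9 | # 3 |
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM | AP@50 | 22.8 | # 3 |
Image-level Supervised Instance Segmentation | COCO 2017 val | CIM | AP@75 | 11.1 | # 3 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM | AP | 12.0 | # 5 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM | AP@50 | 23.0 | # 5 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM | AP@75 | 11.3 | # 5 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM + Mask R-CNN | AP | 17.2 | # 1 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM + Mask R-CNN | AP@50 | 29.7 | # 1 |
Image-level Supervised Instance Segmentation | COCO test-dev | CIM + Mask R-CNN | AP@75 | 17.3 | # 1 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM | mAP@0.5 | 51.1 | # 2 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM | mAP@0.25 | 64.9 | # 2 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM | mAP@0.7 | 32.4 | # 2 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM | mAP@0.75 | 26.1 | # 4 |
Weakly-supervised instance segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.25 | 68.7 | # 2 |
Weakly-supervised instance segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.5 | 55.9 | # 3 |
Weakly-supervised instance segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.75 | 30.9 | # 2 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.5 | 55.9 | # 1 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.25 | 68.7 | # 1 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.7 | 37.1 | # 1 |
Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.75 | 30.9 | # 1 |
Point-Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.5 | 55.5 | # 2 |
Point-Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.25 | 67.8 | # 1 |
Point-Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.7 | 36.6 | # 1 |
Point-Supervised Instance Segmentation | PASCAL VOC 2012 val | CIM + Mask R-CNN | mAP@0.75 | 31.1 | # 1 |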