Search Results for author: Yabiao Wang

Found 77 papers, 43 papers with code

ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection

1 code implementation • 5 Jun 2024 • Jiangning Zhang, Haoyang He, Zhenye Gan, Qingdong He, Yuxuan Cai, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

This paper addresses this issue by proposing a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework that is highly extensible for new methods.

Anomaly Detection Lesion Detection

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

no code implementations • 28 May 2024 • Sihe Zhang, Qingdong He, Jinlong Peng, Yuxi Li, Zhengkai Jiang, Jiafu Wu, Mingmin Chi, Yabiao Wang, Chengjie Wang

To mitigate this issue, we introduce a novel setting for low-quality image retrieval, and propose an Adaptive Noise-Based Network (AdapNet) to learn robust abstract representations.

Image Retrieval Re-Ranking +1

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

no code implementations • 28 May 2024 • Qilin Wang, Zhengkai Jiang, Chengming Xu, Jiangning Zhang, Yabiao Wang, Xinyi Zhang, Yun Cao, Weijian Cao, Chengjie Wang, Yanwei Fu

This enables accurate alignment of pose and shape in the generated videos, providing a robust framework capable of handling a wide range of body shapes and dynamic hand movements.

Image Animation

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

no code implementations • 24 May 2024 • Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma

Text-to-motion synthesis is a crucial task in computer vision.

Motion Synthesis

Open-Vocabulary SAM3D: Understand Any 3D Scene

no code implementations • 24 May 2024 • Hanchen Tai, Qingdong He, Jiangning Zhang, Yijie Qian, Zhenyu Zhang, Xiaobin Hu, Yabiao Wang, Yong Liu

In this paper, we introduce OV-SAM3D, a universal framework for open-vocabulary 3D scene understanding.

Scene Understanding Zero Shot Segmentation

PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning

no code implementations • 24 May 2024 • Qingdong He, Jiangning Zhang, Jinlong Peng, Haoyang He, Yabiao Wang, Chengjie Wang

Transformers have revolutionized the point cloud learning task, but their quadratic complexity hinders extension to long sequences and places a burden on limited computational resources.

Efficient Multimodal Large Language Models: A Survey

1 code implementation • 17 May 2024 • Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

In the past year, Multimodal Large Language Models (MLLMs) have demonstrated remarkable performance in tasks such as visual question answering, visual understanding and reasoning.

Edge-computing Question Answering +1

135

MotionMaster: Training-free Camera Motion Transfer For Video Generation

no code implementations • 24 Apr 2024 • Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma

Furthermore, we propose a few-shot camera motion disentanglement method to extract the common camera motion from multiple videos with similar camera motions, which employs a window-based clustering technique to extract the common features in temporal attention maps of multiple videos.

Disentanglement Motion Disentanglement +2

Single-temporal Supervised Remote Change Detection for Domain Generalization

no code implementations • 17 Apr 2024 • Qiangang Du, Jinlong Peng, Xu Chen, Qingdong He, Liren He, Qiang Nie, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In this paper, we propose a multimodal contrastive learning (ChangeCLIP) based on visual-language pre-training for change detection domain generalization.

Change Detection Contrastive Learning +1

Leveraging Fine-Grained Information and Noise Decoupling for Remote Sensing Change Detection

no code implementations • 17 Apr 2024 • Qiangang Du, Jinlong Peng, Changan Wang, Xu Chen, Qingdong He, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Next, a shape-aware and a brightness-aware module are designed to improve the capacity for representation learning.

Change Detection Denoising +1

DMAD: Dual Memory Bank for Real-World Anomaly Detection

no code implementations • 19 Mar 2024 • Jianlong Hu, Xu Chen, Zhenye Gan, Jinlong Peng, Shengchuan Zhang, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Liujuan Cao, Rongrong Ji

To address the challenge of real-world anomaly detection, we propose a new framework named Dual Memory bank enhanced representation learning for Anomaly Detection (DMAD).

Anomaly Detection Representation Learning

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

no code implementations • 18 Mar 2024 • Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

In the field of multi-class anomaly detection, reconstruction-based methods derived from single-class anomaly detection face the well-known challenge of "learning shortcuts", wherein the model fails to learn the patterns of normal samples as it should, opting instead for shortcuts such as identity mapping or artificial noise elimination.

Anomaly Detection

PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models

no code implementations • 11 Mar 2024 • Qingdong He, Jinlong Peng, Zhengkai Jiang, Xiaobin Hu, Jiangning Zhang, Qiang Nie, Yabiao Wang, Chengjie Wang

On top of that, PointSeg can be combined with various segmentation models and even surpasses supervised methods.

Scene Segmentation

Dual-path Frequency Discriminators for Few-shot Anomaly Detection

no code implementations • 7 Mar 2024 • Yuhu Bai, Jiangning Zhang, Yuhang Dong, Guanzhong Tian, Liang Liu, Yunkang Cao, Yabiao Wang, Chengjie Wang

We consider anomaly detection as a discriminative classification problem, so a dual-path feature discrimination module is employed to detect and locate image-level and feature-level anomalies in the feature space.

Anomaly Detection

UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation

1 code implementation • 21 Jan 2024 • Qingdong He, Jinlong Peng, Zhengkai Jiang, Kai Wu, Xiaozhong Ji, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Mingang Chen, Yunsheng Wu

3D open-vocabulary scene understanding aims to recognize arbitrary novel categories beyond the base label space.

Instance Segmentation Scene Understanding +1

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

no code implementations • 6 Jan 2024 • Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Xuhai Chen, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Industrial anomaly detection is generally addressed as an unsupervised task that aims at locating defects with only normal training samples.

Anomaly Detection

Density Matters: Improved Core-set for Active Domain Adaptive Segmentation

no code implementations • 15 Dec 2023 • Shizhan Liu, Zhengkai Jiang, Yuxi Li, Jinlong Peng, Yabiao Wang, Weiyao Lin

Active domain adaptation has emerged as a solution to balance the expensive annotation cost and the performance of trained models in semantic segmentation.

Domain Adaptation Semantic Segmentation

Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection

1 code implementation • 12 Dec 2023 • Jiangning Zhang, Xuhai Chen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li, Ming-Hsuan Yang, Dacheng Tao

Following this spirit, this paper explores plain ViT architecture for MUAD.

Unsupervised Anomaly Detection

DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection

1 code implementation • 11 Dec 2023 • Haoyang He, Jiangning Zhang, Hongxu Chen, Xuhai Chen, Zhishan Li, Xu Chen, Yabiao Wang, Chengjie Wang, Lei Xie

Reconstruction-based approaches have achieved remarkable outcomes in anomaly detection.

Anomaly Detection Denoising +1

AnomalyDiffusion: Few-Shot Anomaly Image Generation with Diffusion Model

1 code implementation • 10 Dec 2023 • Teng Hu, Jiangning Zhang, Ran Yi, Yuzhen Du, Xu Chen, Liang Liu, Yabiao Wang, Chengjie Wang

Existing anomaly inspection methods are limited in their performance due to insufficient anomaly data.

Image Generation

GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection

1 code implementation • 5 Nov 2023 • Jiangning Zhang, Haoyang He, Xuhai Chen, Zhucun Xue, Yabiao Wang, Chengjie Wang, Lei Xie, Yong Liu

Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm.

Anomaly Detection Question Answering +3

CLIP-AD: A Language-Guided Staged Dual-Path Model for Zero-shot Anomaly Detection

no code implementations • 1 Nov 2023 • Xuhai Chen, Jiangning Zhang, Guanzhong Tian, Haoyang He, Wuhao Zhang, Yabiao Wang, Chengjie Wang, Yong Liu

This paper considers zero-shot Anomaly Detection (AD), performing AD without reference images of the test objects.

Anomaly Detection Language Modelling +2

Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption

1 code implementation • ICCV 2023 • Teng Hu, Jiangning Zhang, Liang Liu, Ran Yi, Siqi Kou, Haokun Zhu, Xu Chen, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To address these problems, we propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model.

Domain Adaptation

Stroke-based Neural Painting and Stylization with Dynamically Predicted Painting Region

2 code implementations • 7 Sep 2023 • Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, Lizhuang Ma

To solve the problem, we propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions.

Style Transfer

Toward High Quality Facial Representation Learning

1 code implementation • 7 Sep 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Liang Liu, Yabiao Wang, Chengjie Wang

To improve the facial representation quality, we use the feature map of a pre-trained visual backbone as a supervision item and a partially pre-trained decoder for mask image modeling.

Contrastive Learning Decoder +3

IIDM: Inter and Intra-domain Mixing for Semi-supervised Domain Adaptation in Semantic Segmentation

no code implementations • 30 Aug 2023 • WeiFu Fu, Qiang Nie, Jialin Li, Yuhuan Lin, Kai Wu, Jian Li, Yabiao Wang, Yong Liu, Chengjie Wang

In this paper, we highlight the significance of exploiting the intra-domain information between the labeled target data and unlabeled target data.

Semantic Segmentation Semi-supervised Domain Adaptation +1

PVG: Progressive Vision Graph for Vision Recognition

no code implementations • 1 Aug 2023 • Jiafu Wu, Jian Li, Jiangning Zhang, Boshen Zhang, Mingmin Chi, Yabiao Wang, Chengjie Wang

Convolution-based and Transformer-based vision backbone networks process images into the grid or sequence structures, respectively, which are inflexible for capturing irregular objects.

graph construction

RFENet: Towards Reciprocal Feature Evolution for Glass Segmentation

1 code implementation • 12 Jul 2023 • Ke Fan, Changan Wang, Yabiao Wang, Chengjie Wang, Ran Yi, Lizhuang Ma

Glass-like objects are widespread in daily life but remain difficult for most existing methods to segment.

Semantic Segmentation

Align, Perturb and Decouple: Toward Better Leverage of Difference Information for RSI Change Detection

1 code implementation • 30 May 2023 • Supeng Wang, Yuxi Li, Ming Xie, Mingmin Chi, Yabiao Wang, Chengjie Wang, Wenbing Zhu

In this paper, we revisit the importance of feature difference for change detection in RSI, and propose a series of operations to fully exploit the difference information: Alignment, Perturbation and Decoupling (APD).

Change Detection Decoder

Dual Path Transformer with Partition Attention

no code implementations • 24 May 2023 • Zhengkai Jiang, Liang Liu, Jiangning Zhang, Yabiao Wang, Mingang Chen, Chengjie Wang

This paper introduces a novel attention mechanism, called dual attention, which is both efficient and effective.

Image Classification object-detection +2

Learning Global-aware Kernel for Image Harmonization

no code implementations • ICCV 2023 • Xintian Shen, Jiangning Zhang, Jun Chen, Shipeng Bai, Yue Han, Yabiao Wang, Chengjie Wang, Yong Liu

To address this issue, we propose a novel Global-aware Kernel Network (GKNet) to harmonize local regions with comprehensive consideration of long-distance background references.

Ranked #5 on Image Harmonization on iHarmony4

Image Harmonization

TransAVS: End-To-End Audio-Visual Segmentation With Transformer

no code implementations • 12 May 2023 • Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang

Generally AVS faces two key challenges: (1) Audio signals inherently exhibit a high degree of information density, as sounds produced by multiple objects are entangled within the same audio stream; (2) Objects of the same category tend to produce similar audio signals, making it difficult to distinguish between them and thus leading to unclear segmentation results.

Scene Understanding Segmentation +1

Better "CMOS" Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution

1 code implementation • CVPR 2023 • Xuhai Chen, Jiangning Zhang, Chao Xu, Yabiao Wang, Chengjie Wang, Yong Liu

Most of the existing blind image Super-Resolution (SR) methods assume that the blur kernels are space-invariant.

Image Super-Resolution SSIM

MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection

1 code implementation • CVPR 2023 • Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Guanzhong Tian, Wenbing Zhu, Yabiao Wang, Chengjie Wang

Despite the remarkable progress made by modern detection models, this challenge is particularly evident in the semi-supervised case.

Ranked #3 on Semi-Supervised Object Detection on COCO 2% labeled data

Object object-detection +3

Calibrated Teacher for Sparsely Annotated Object Detection

1 code implementation • 14 Mar 2023 • Haohan Wang, Liang Liu, Boshen Zhang, Jiangning Zhang, Wuhao Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Recent works on sparsely annotated object detection alleviate this problem by generating pseudo labels for the missing annotations.

Object object-detection +2

Iterative Few-shot Semantic Segmentation from Image Label Text

1 code implementation • 10 Mar 2023 • Haohan Wang, Liang Liu, Wuhao Zhang, Jiangning Zhang, Zhenye Gan, Yabiao Wang, Chengjie Wang, Haoqian Wang

Few-shot semantic segmentation aims to learn to segment unseen class objects with the guidance of only a few support images.

Ranked #41 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Few-Shot Semantic Segmentation Language Modelling +1

Multimodal Industrial Anomaly Detection via Hybrid Fusion

1 code implementation • CVPR 2023 • Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie Wang

2D-based Industrial Anomaly Detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many unexplored areas.

Ranked #3 on RGB+3D Anomaly Detection and Segmentation on MVTEC 3D-AD (using extra training data)

Contrastive Learning RGB+3D Anomaly Detection and Segmentation

124

Learning with Noisy labels via Self-supervised Adversarial Noisy Masking

1 code implementation • CVPR 2023 • Yuanpeng Tu, Boshen Zhang, Yuxi Li, Liang Liu, Jian Li, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Collecting large-scale datasets is crucial for training deep models; annotating the data, however, inevitably yields noisy labels, which pose challenges to deep learning algorithms.

Ranked #2 on Image Classification on Clothing1M (using extra training data)

Learning with noisy labels

Self-Supervised Likelihood Estimation with Energy Guidance for Anomaly Segmentation in Urban Scenes

1 code implementation • 14 Feb 2023 • Yuanpeng Tu, Yuxi Li, Boshen Zhang, Liang Liu, Jiangning Zhang, Yabiao Wang, Cai Rong Zhao

Based on the proposed estimators, we devise an adaptive self-supervised training framework, which exploits the contextual reliance and estimated likelihood to refine mask annotations in anomaly areas.

Anomaly Detection Autonomous Driving

Learning from Noisy Labels with Decoupled Meta Label Purifier

1 code implementation • CVPR 2023 • Yuanpeng Tu, Boshen Zhang, Yuxi Li, Liang Liu, Jian Li, Yabiao Wang, Chengjie Wang, Cai Rong Zhao

Training deep neural networks (DNNs) with noisy labels is challenging since DNNs can easily memorize inaccurate labels, leading to poor generalization ability.

Ranked #5 on Image Classification on Clothing1M (using clean data)

Image Classification Meta-Learning +1

Exploring Efficient Few-shot Adaptation for Vision Transformers

1 code implementation • 6 Jan 2023 • Chengming Xu, Siqian Yang, Yabiao Wang, Zhanxiong Wang, Yanwei Fu, Xiangyang Xue

Essentially, although ViTs have been shown to achieve comparable or even better performance on other vision tasks, it is still nontrivial to efficiently finetune them in real-world FSL scenarios.

Few-Shot Learning

Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

1 code implementation • 3 Jan 2023 • Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support/query features based on a Transformer-like framework.

Benchmarking Few-Shot Object Detection +3

Rethinking Mobile Block for Efficient Attention-based Models

1 code implementation • ICCV 2023 • Jiangning Zhang, Xiangtai Li, Jian Li, Liang Liu, Zhucun Xue, Boshen Zhang, Zhengkai Jiang, Tianxin Huang, Yabiao Wang, Chengjie Wang

This paper focuses on developing modern, efficient, lightweight models for dense predictions while trading off parameters, FLOPs, and performance.

Unity

217

Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection

2 code implementations • ICCV 2023 • Zhihao Gu, Liang Liu, Xu Chen, Ran Yi, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Annan Shu, Guannan Jiang, Lizhuang Ma

Specifically, we first propose a normality recall memory (NR Memory) to strengthen the normality of student-generated features by recalling the stored normal information.

Ranked #11 on Anomaly Detection on MVTec AD

Knowledge Distillation Unsupervised Anomaly Detection

Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning

1 code implementation • 30 Nov 2022 • Chengming Xu, Chen Liu, Siqian Yang, Yabiao Wang, Shijie Zhang, Lijie Jia, Yanwei Fu

Since only some of the most confident positive samples are available and the evidence is insufficient to categorize the remaining samples, many of the unlabeled data may also be positive samples.

Binary Classification

PatchMix Augmentation to Identify Causal Features in Few-shot Learning

no code implementations • 29 Nov 2022 • Chengming Xu, Chen Liu, Xinwei Sun, Siqian Yang, Yabiao Wang, Chengjie Wang, Yanwei Fu

We theoretically show that such an augmentation mechanism, different from existing ones, is able to identify the causal features.

Data Augmentation Few-Shot Learning +1

Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling

no code implementations • 23 Aug 2022 • Boshen Zhang, Yuxi Li, Yuanpeng Tu, Jinlong Peng, Yabiao Wang, Cunlin Wu, Yang Xiao, Cairong Zhao

Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set.

Denoising Image Classification

Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation

1 code implementation • 14 Jul 2022 • Zhengkai Jiang, Yuxi Li, Ceyuan Yang, Peng Gao, Yabiao Wang, Ying Tai, Chengjie Wang

Unsupervised Domain Adaptation (UDA) aims to adapt the model trained on the labeled source domain to an unlabeled target domain.

Ranked #14 on Unsupervised Domain Adaptation on SYNTHIA-to-Cityscapes

Contrastive Learning Semantic Segmentation +1

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm

1 code implementation • 19 Jun 2022 • Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao

Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derives that both have consistent mathematical formulation.

Image Classification

FRIH: Fine-grained Region-aware Image Harmonization

no code implementations • 13 May 2022 • Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image.

Decoder Image Harmonization

Learning Distinctive Margin toward Active Domain Adaptation

1 code implementation • CVPR 2022 • Ming Xie, Yuxi Li, Yabiao Wang, Zekun Luo, Zhenye Gan, Zhongyi Sun, Mingmin Chi, Chengjie Wang, Pei Wang

Despite many efforts to improve domain adaptation (DA) ability under unsupervised or few-shot semi-supervised settings, active learning has recently attracted more attention because it suits transferring a model in a more practical way with limited annotation resources on target data.

Active Learning Domain Adaptation

STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation

no code implementations • 8 Feb 2022 • Zhengkai Jiang, Zhangxuan Gu, Jinlong Peng, Hang Zhou, Liang Liu, Yabiao Wang, Ying Tai, Chengjie Wang, Liqing Zhang

In contrast, we present a simple and efficient single-stage VIS framework based on the instance segmentation method CondInst by adding an extra tracking head.

Ranked #36 on Video Instance Segmentation on YouTube-VIS validation

Contrastive Learning Instance Segmentation +3

ASFD: Automatic and Scalable Face Detector

no code implementations • 26 Jan 2022 • Jian Li, Bin Zhang, Yabiao Wang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Jilin Li, Xiaoming Huang, Yili Xia

Along with current multi-scale based detectors, Feature Aggregation and Enhancement (FAE) modules have shown superior performance gains for cutting-edge object detection.

Ranked #1 on Face Detection on WIDER Face (Medium)

Face Detection object-detection +1

SCSNet: An Efficient Paradigm for Learning Simultaneously Image Colorization and Super-Resolution

no code implementations • 12 Jan 2022 • Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai, Yong Liu

In the practical application of restoring low-resolution gray-scale images, we generally need to run three separate processes of image colorization, super-resolution, and down-sampling operation for the target device.

Colorization Image Colorization +1

ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-High Resolution Segmentation

1 code implementation • CVPR 2022 • Shaohua Guo, Liang Liu, Zhenye Gan, Yabiao Wang, Wuhao Zhang, Chengjie Wang, Guannan Jiang, Wei Zhang, Ran Yi, Lizhuang Ma, Ke Xu

The huge burden of computation and memory are two obstacles in ultra-high resolution image segmentation.

Image Segmentation Segmentation +1

LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization

no code implementations • 10 Dec 2021 • Zhiwei Chen, Changan Wang, Yabiao Wang, Guannan Jiang, Yunhang Shen, Ying Tai, Chengjie Wang, Wei Zhang, Liujuan Cao

In this paper, we propose a novel framework built upon the transformer, termed LCTR (Local Continuity TRansformer), which aims to enhance the local perception capability of global features among long-range feature dependencies.

Inductive Bias Object +1

LSTC: Boosting Atomic Action Detection with Long-Short-Term Context

1 code implementation • 19 Oct 2021 • Yuxi Li, Boshen Zhang, Jian Li, Yabiao Wang, Weiyao Lin, Chengjie Wang, Jilin Li, Feiyue Huang

We demonstrate that both temporal grains are beneficial to atomic action recognition.

Action Detection Atomic action recognition

Robust Learning with Adaptive Sample Credibility Modeling

no code implementations • 29 Sep 2021 • Boshen Zhang, Yuxi Li, Yuanpeng Tu, Yabiao Wang, Yang Xiao, Cai Rong Zhao, Chengjie Wang

For the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thereby alleviating the effect of potentially hard noisy samples in the clean set.

Denoising

Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting

3 code implementations • 27 Jul 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu

Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.

Crowd Counting Quantization

399

Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework

3 code implementations • 27 Jul 2021 • Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu

In this paper, we propose a purely point-based framework for joint crowd counting and individual localization.

Ranked #4 on Crowd Counting on ShanghaiTech A

Crowd Counting

399

NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection

1 code implementation • 4 Jun 2021 • Zekun Luo, Zheng Fang, Sixiao Zheng, Yabiao Wang, Yanwei Fu

Non-Maximum Suppression (NMS) is essential for object detection and affects the evaluation results by incorporating False Positives (FP) and False Negatives (FN), especially in crowd occlusion scenes.

Ranked #6 on Pedestrian Detection on Caltech

object-detection Object Detection +1

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

1 code implementation • NeurIPS 2021 • Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong Liu

Inspired by biological evolution, we explain the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA) and derive that both of them have consistent mathematical representation.

Image Retrieval Retrieval

SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

no code implementations • 24 May 2021 • Jinlong Peng, Zhengkai Jiang, Yueyang Gu, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

In addition, we add a localization branch to predict the localization accuracy, so that it can work as the replacement of the regression assistance link during inference.

Classification Object +2

Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

1 code implementation • CVPR 2021 • Chuming Lin, Chengming Xu, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu

Temporal action localization is an important yet challenging task in video understanding.

Temporal Action Localization Temporal Localization +1

168

Learning Comprehensive Motion Representation for Action Recognition

no code implementations • 23 Mar 2021 • Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang

For action recognition learning, 2D CNN-based methods are efficient but may yield redundant features due to applying the same 2D convolution kernel to each frame.

Action Recognition

Uniformity in Heterogeneity: Diving Deep Into Count Interval Partition for Crowd Counting

1 code implementation • ICCV 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu

Crowd Counting Quantization

Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework

1 code implementation • ICCV 2021 • Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu

In this paper, we propose a purely point-based framework for joint crowd counting and individual localization.

Crowd Counting

399

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

5 code implementations • CVPR 2021 • Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip H. S. Torr, Li Zhang

In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task.

Ranked #2 on Semantic Segmentation on FoodSeg103 (using extra training data)

Decoder Medical Image Segmentation +2

8,354

Dense Scene Multiple Object Tracking with Box-Plane Matching

no code implementations • 30 Jul 2020 • Jinlong Peng, Yueyang Gu, Yabiao Wang, Chengjie Wang, Jilin Li, Feiyue Huang

Multiple Object Tracking (MOT) is an important task in computer vision.

Multiple Object Tracking Object

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

1 code implementation • ECCV 2020 • Jinlong Peng, Changan Wang, Fangbin Wan, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu

Existing Multiple-Object Tracking (MOT) methods either follow the tracking-by-detection paradigm to conduct object detection, feature extraction and data association separately, or have two of the three subtasks integrated to form a partially end-to-end solution.

Multiple Object Tracking Object +3

246

Temporal Distinct Representation Learning for Action Recognition

no code implementations • ECCV 2020 • Junwu Weng, Donghao Luo, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xudong Jiang, Junsong Yuan

Motivated by the previous success of Two-Dimensional Convolutional Neural Network (2D CNN) on image recognition, researchers endeavor to leverage it to characterize videos.

Action Recognition Representation Learning

ACFD: Asymmetric Cartoon Face Detector

2 code implementations • 2 Jul 2020 • Bin Zhang, Jian Li, Yabiao Wang, Zhipeng Cui, Yili Xia, Chengjie Wang, Jilin Li, Feiyue Huang

Cartoon face detection is a more challenging task than human face detection because many difficult scenarios are involved.

Binary Classification Face Detection

Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation

2 code implementations • CVPR 2020 • Liang Liu, Jiangning Zhang, Ruifei He, Yong Liu, Yabiao Wang, Ying Tai, Donghao Luo, Chengjie Wang, Jilin Li, Feiyue Huang

Unsupervised learning of optical flow, which leverages the supervision from view synthesis, has emerged as a promising alternative to supervised methods.

Ranked #2 on Optical Flow Estimation on KITTI 2012 unsupervised

Decoder Optical Flow Estimation +1

249

ASFD: Automatic and Scalable Face Detector

no code implementations • 25 Mar 2020 • Bin Zhang, Jian Li, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yili Xia, Wenjiang Pei, Rongrong Ji

In this paper, we propose a novel Automatic and Scalable Face Detector (ASFD), which is based on a combination of neural architecture search techniques as well as a new loss design.

Neural Architecture Search

TEINet: Towards an Efficient Architecture for Video Recognition

no code implementations • 21 Nov 2019 • Zhao-Yang Liu, Donghao Luo, Yabiao Wang, Li-Min Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Tong Lu

To relieve this problem, we propose an efficient temporal module, termed Temporal Enhancement-and-Interaction (TEI Module), which can be plugged into existing 2D CNNs (denoted by TEINet).

Action Recognition Video Recognition

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

3 code implementations • 11 Nov 2019 • Chuming Lin, Jian Li, Yabiao Wang, Ying Tai, Donghao Luo, Zhipeng Cui, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji

In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals.

Ranked #7 on Temporal Action Localization on FineAction

General Classification Optical Flow Estimation +2

345

DSFD: Dual Shot Face Detector

4 code implementations • CVPR 2019 • Jian Li, Yabiao Wang, Changan Wang, Ying Tai, Jianjun Qian, Jian Yang, Chengjie Wang, Jilin Li, Feiyue Huang

In this paper, we propose a novel face detection network with three novel contributions that address three key aspects of face detection, including better feature learning, progressive loss design and anchor assign based data augmentation, respectively.

Ranked #1 on Face Detection on FDDB

Data Augmentation Occluded Face Detection

2,869
