Search Results for author: Feng Zheng

Found 92 papers, 50 papers with code

Enabling Deep Residual Networks for Weakly Supervised Object Detection

no code implementations • ECCV 2020 • Yunhang Shen, Rongrong Ji, Yan Wang, Zhiwei Chen, Feng Zheng, Feiyue Huang, Yunsheng Wu

Weakly supervised object detection (WSOD) has attracted extensive research attention due to its great flexibility of exploiting large-scale image-level annotation for detector training.

Object object-detection +1

LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model

no code implementations • 6 Jun 2024 • Yixuan Yang, Junru Lu, Zixiang Zhao, Zhen Luo, James J. Q. Yu, Victor Sanchez, Feng Zheng

In this paper, we introduce LLplace, a novel 3D indoor scene layout designer based on lightweight fine-tuned open-source LLM Llama3.

Language Modelling Large Language Model +2

On the Noise Robustness of In-Context Learning for Text Generation

no code implementations • 27 May 2024 • Hongfu Gao, Feipeng Zhang, Wenyu Jiang, Jun Shu, Feng Zheng, Hongxin Wei

In this work, we show that, on text generation tasks, noisy annotations significantly hurt the performance of in-context learning.

In-Context Learning text-classification +2

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer

no code implementations • 26 Apr 2024 • Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li

Emotion recognition aims to discern the emotional state of subjects within an image, relying on subject-centric and contextual visual cues.

Emotion Classification Emotion Recognition

UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization

1 code implementation • 4 Apr 2024 • Tiantian Geng, Teng Wang, Yanfu Zhang, Jinming Duan, Weili Guan, Feng Zheng

Video localization tasks aim to temporally locate specific instances in videos, including temporal action localization (TAL), sound event detection (SED) and audio-visual event localization (AVEL).

audio-visual event localization Event Detection +2

Negative Label Guided OOD Detection with Pretrained Vision-Language Models

1 code implementation • 29 Mar 2024 • Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han

In this paper, we propose a novel post hoc OOD detection method, called NegLabel, which takes a vast number of negative labels from extensive corpus databases.

Out of Distribution (OOD) Detection

Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference

no code implementations • 21 Mar 2024 • Xi Jiang, Ying Chen, Qiang Nie, Jianlin Liu, Yong Liu, Chengjie Wang, Feng Zheng

To address this issue, we introduce a Multi-class Implicit Neural representation Transformer for unified Anomaly Detection (MINT-AD), which leverages the fine-grained category information in the training stage.

Anomaly Detection Decoder

SoftPatch: Unsupervised Anomaly Detection with Noisy Data

1 code implementation • NeurIPS 2022 • Xi Jiang, Ying Chen, Qiang Nie, Yong Liu, Jianlin Liu, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng

Noise discriminators are utilized to generate outlier scores for patch-level noise elimination before coreset construction.

Unsupervised Anomaly Detection

Tuning-Free Image Customization with Image and Text Guidance

no code implementations • 19 Mar 2024 • Pengzhi Li, Qiang Nie, Ying Chen, Xi Jiang, Kai Wu, Yuhuan Lin, Yong Liu, Jinlong Peng, Chengjie Wang, Feng Zheng

To our knowledge, this is the first tuning-free method that concurrently utilizes text and image guidance for image customization in specific regions.

Decoder Denoising +1

Place Anything into Any Video

no code implementations • 22 Feb 2024 • Ziling Liu, Jinyu Yang, Mingqi Gao, Feng Zheng

This paper introduces a novel and efficient system named Place-Anything, which facilitates the insertion of any object into any video solely based on a picture or text description of the target object or element.

3D Generation Object +2

Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt

1 code implementation • 2 Jan 2024 • Jiaqi Liu, Kai Wu, Qiang Nie, Ying Chen, Bin-Bin Gao, Yong Liu, Jinbao Wang, Chengjie Wang, Feng Zheng

Unsupervised Anomaly Detection (UAD) with incremental training is crucial in industrial manufacturing, as unpredictable defects make obtaining sufficient labeled data infeasible.

continual anomaly detection Continual Learning +2

Video Understanding with Large Language Models: A Survey

1 code implementation • 29 Dec 2023 • Yunlong Tang, Jing Bi, Siting Xu, Luchuan Song, Susan Liang, Teng Wang, Daoan Zhang, Jie An, Jingyang Lin, Rongyi Zhu, Ali Vosoughi, Chao Huang, Zeliang Zhang, Feng Zheng, JianGuo Zhang, Ping Luo, Jiebo Luo, Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly.

Video Understanding

Beyond Prototypes: Semantic Anchor Regularization for Better Representation Learning

1 code implementation • 19 Dec 2023 • Yanqi Ge, Qiang Nie, Ye Huang, Yong Liu, Chengjie Wang, Feng Zheng, Wen Li, Lixin Duan

By pulling the learned features to these semantic anchors, several advantages can be attained: 1) intra-class compactness and natural inter-class separability, 2) avoidance of induced bias or errors from feature learning, and 3) robustness to the long-tailed problem.

Disentanglement

Real3D-AD: A Dataset of Point Cloud Anomaly Detection

1 code implementation • NeurIPS 2023 • Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng

High-precision point cloud anomaly detection is the gold standard for identifying the defects of advancing machining and precision manufacturing.

3D Anomaly Detection

Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples

1 code implementation • ICCV 2023 • Guanghui Li, Mingqi Gao, Heng Liu, XianTong Zhen, Feng Zheng

Referring video object segmentation (RVOS), as a supervised learning task, relies on sufficient annotated data for a given scene.

Referring Video Object Segmentation Semantic Segmentation +1

Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models

no code implementations • ICCV 2023 • Baoshuo Kan, Teng Wang, Wenpeng Lu, XianTong Zhen, Weili Guan, Feng Zheng

Pre-trained vision-language models, e.g., CLIP, working with manually designed prompts have demonstrated great capacity for transfer learning.

Few-Shot Image Classification Transfer Learning

Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

2 code implementations • 17 Aug 2023 • Runmin Cong, Hongyu Liu, Chen Zhang, Wei Zhang, Feng Zheng, Ran Song, Sam Kwong

By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved.

object-detection RGB-D Salient Object Detection +1

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

1 code implementation • ICCV 2023 • Junjie Fei, Teng Wang, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng

In this paper, we propose ViECap, a transferable decoding model that leverages entity-aware decoding to generate descriptions in both seen and unseen scenarios.

Caption Generation Hallucination +2

Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models

1 code implementation • ICCV 2023 • Dong Lu, Zhiqiang Wang, Teng Wang, Weili Guan, Hongchang Gao, Feng Zheng

Vision-language pre-training (VLP) models have shown vulnerability to adversarial examples in multimodal tasks.

Retrieval Text Retrieval

EasyNet: An Easy Network for 3D Industrial Anomaly Detection

no code implementations • 26 Jul 2023 • Ruitao Chen, Guoyang Xie, Jiaqi Liu, Jinbao Wang, Ziqi Luo, Jinfan Wang, Feng Zheng

3D anomaly detection is an emerging and vital computer vision task in industrial manufacturing (IM).

3D Anomaly Detection Decoder

K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment

no code implementations • 10 Jul 2023 • Guoyang Xie, Jinbao Wang, Yawen Huang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin

To further reflect the frequency-specific information from the magnetic resonance imaging principles, both k-space features and vision features are obtained and employed in our comprehensive encoders with a frequency reconstruction penalty.

Image Generation SSIM

LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad

1 code implementation • 7 Jul 2023 • Siting Xu, Yunlong Tang, Feng Zheng

To assist and inspire the design of the Launchpad light effect, and provide a more accessible approach for beginners to create music visualization with this instrument, we proposed the LaunchpadGPT model to generate music visualization designs on Launchpad automatically.

Language Modelling

LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning

1 code implementation • 17 Jun 2023 • Yunlong Tang, Jinrui Zhang, Xiangchen Wang, Teng Wang, Feng Zheng

This paper proposes an effective model LLMVA-GEBC (Large Language Model with Video Adapter for Generic Event Boundary Captioning): (1) We utilize a pretrained LLM for generating human-like captions with high quality.

Boundary Captioning Language Modelling +1

CageViT: Convolutional Activation Guided Efficient Vision Transformer

no code implementations • 17 May 2023 • Hao Zheng, Jinbao Wang, XianTong Zhen, Hong Chen, Jingkuan Song, Feng Zheng

Recently, Transformers have emerged as the go-to architecture for both vision and language modeling tasks, but their computational efficiency is limited by the length of the input sequence.

Computational Efficiency Image Classification +1

Detecting Out-of-distribution Data through In-distribution Class Prior

1 code implementation • ICML 2023 • Xue Jiang, Feng Liu, Zhen Fang, Hong Chen, Tongliang Liu, Feng Zheng, Bo Han

In this paper, we show that this assumption makes the above methods incapable when the ID model is trained with class-imbalanced data. Fortunately, by analyzing the causal relations between ID/OOD classes and features, we identify several common scenarios where the OOD-to-ID probabilities should be the ID-class-prior distribution and propose two strategies to modify existing inference-time detection methods: 1) replace the uniform distribution with the ID-class-prior distribution if they explicitly use the uniform distribution; 2) otherwise, reweight their scores according to the similarity between the ID-class-prior distribution and the softmax outputs of the pre-trained model.

Ranked #15 on Out-of-Distribution Detection on ImageNet-1k vs Curated OODs (avg.)

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Track Anything: Segment Anything Meets Videos

1 code implementation • 24 Apr 2023 • Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng

Therefore, in this report, we propose Track Anything Model (TAM), which achieves high-performance interactive tracking and segmentation in videos.

Image Segmentation Segmentation +2

Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems?

no code implementations • 24 Apr 2023 • Yihan Zhang, Wenhao Jiang, Feng Zheng, Chiu C. Tan, Xinghua Shi, Hongchang Gao

This motivates us to study decentralized minimax optimization algorithms for the nonconvex-nonconcave problem.

What makes a good data augmentation for few-shot unsupervised image anomaly detection?

no code implementations • 6 Apr 2023 • Lingrui Zhang, Shuheng Zhang, Guoyang Xie, Jiaqi Liu, Hua Yan, Jinbao Wang, Feng Zheng, Yaochu Jin

Data augmentation is a promising technique for unsupervised anomaly detection in industrial applications, where the availability of positive samples is often limited due to factors such as commercial competition and sample collection difficulties.

Data Augmentation Unsupervised Anomaly Detection

Accelerating Vision-Language Pretraining with Free Language Modeling

1 code implementation • CVPR 2023 • Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, XiaoHu Qie, Ping Luo

FLM successfully frees the prediction rate from the tie-up with the corruption rate while allowing the corruption spans to be customized for each token to be predicted.

Language Modelling Masked Language Modeling

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline

1 code implementation • CVPR 2023 • Tiantian Geng, Teng Wang, Jinming Duan, Runmin Cong, Feng Zheng

To better adapt to real-life applications, in this paper we focus on the task of dense-localizing audio-visual events, which aims to jointly localize and recognize all audio-visual events occurring in an untrimmed video.

Ranked #1 on audio-visual event localization on UnAV-100

audio-visual event localization

Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

1 code implementation • 11 Mar 2023 • Teng Wang, Jinrui Zhang, Feng Zheng, Wenhao Jiang, Ran Cheng, Ping Luo

Our framework is easily extensible to tasks covering visually-grounded language understanding and generation.

Ranked #1 on Natural Language Moment Retrieval on ActivityNet Captions

Dense Video Captioning Natural Language Moment Retrieval +2

On the Stability and Generalization of Triplet Learning

no code implementations • 20 Feb 2023 • Jun Chen, Hong Chen, Xue Jiang, Bin Gu, Weifu Li, Tieliang Gong, Feng Zheng

Triplet learning, i.e., learning from triplet data, has attracted much attention in computer vision tasks with an extremely large number of categories, e.g., face recognition and person re-identification.

Face Recognition Metric Learning +1

IM-IAD: Industrial Image Anomaly Detection Benchmark in Manufacturing

2 code implementations • 31 Jan 2023 • Guoyang Xie, Jinbao Wang, Jiaqi Liu, Jiayi Lyu, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

We realize that the lack of a uniform IM benchmark is hindering the development and usage of IAD methods in real-world applications.

Anomaly Detection Continual Learning +1

Pushing the Limits of Fewshot Anomaly Detection in Industry Vision: Graphcore

no code implementations • 28 Jan 2023 • Guoyang Xie, Jinbao Wang, Jiaqi Liu, Feng Zheng, Yaochu Jin

Besides, we provide a novel model, GraphCore, via VIIFs that can quickly implement unsupervised FSAD training and can improve the performance of anomaly detection.

Anomaly Detection

Deep Industrial Image Anomaly Detection: A Survey

1 code implementation • 27 Jan 2023 • Jiaqi Liu, Guoyang Xie, Jinbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, Yaochu Jin

In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics and datasets.

Anomaly Detection

Resource-Efficient RGBD Aerial Tracking

1 code implementation • CVPR 2023 • Jinyu Yang, Shang Gao, Zhe Li, Feng Zheng, Aleš Leonardis

However, current research on aerial perception has mainly focused on limited categories, such as pedestrian or vehicle, and most scenes are captured in urban environments from a birds-eye view.

Object Tracking

Learning Dual-Fused Modality-Aware Representations for RGBD Tracking

no code implementations • 6 Nov 2022 • Shang Gao, Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song

However, some existing RGBD trackers use the two modalities separately and thus some particularly useful shared information between them is ignored.

Object Tracking

Towards Continual Adaptation in Industrial Anomaly Detection

1 code implementation • ACMMM 2022 • Wujin Li, Jiawei Zhan, Jinbao Wang, Bizhong Xia, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng

We believe that the proposed task and benchmark will be beneficial to the field of AD.

Anomaly Detection continual anomaly detection +2

Does Thermal Really Always Matter for RGB-T Salient Object Detection?

2 code implementations • 9 Oct 2022 • Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong

In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.

object-detection Object Detection +2

Deep Manifold Hashing: A Divide-and-Conquer Approach for Semi-Paired Unsupervised Cross-Modal Retrieval

no code implementations • 26 Sep 2022 • Yufeng Shi, Xinge You, Jiamiao Xu, Feng Zheng, Qinmu Peng, Weihua Ou

Hashing that projects data into binary codes has shown extraordinary talents in cross-modal retrieval due to its low storage usage and high query speed.

Cross-Modal Retrieval Retrieval

Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward

1 code implementation • 25 Sep 2022 • Yunlong Tang, Siting Xu, Teng Wang, Qin Lin, Qinglin Lu, Feng Zheng

The existing method performs well at video segmentation stages but suffers from the problems of dependencies on extra cumbersome models and poor performance at the segment assemblage stage.

Decoder Video Editing +2

Pose-Aided Video-based Person Re-Identification via Recurrent Graph Convolutional Network

no code implementations • 23 Sep 2022 • Honghu Pan, Qiao Liu, Yongyong Chen, Yunqi He, Yuan Zheng, Feng Zheng, Zhenyu He

Finally, we propose a dual-attention method consisting of node-attention and time-attention to obtain the temporal graph representation from the node embeddings, where the self-attention mechanism is employed to learn the importance of each node and each frame.

Retrieval Video-Based Person Re-Identification +1

Spatial-Temporal Pyramid Graph Reasoning for Action Recognition

no code implementations • TIP 2022 • Tiantian Geng, Feng Zheng, Xiaorong Hou, Ke Lu, Guo-Jun Qi, Ling Shao

Spatial-temporal relation reasoning is a significant yet challenging problem for video action recognition.

Ranked #35 on Action Recognition on Something-Something V1

Action Recognition Representation Learning +1

S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning

no code implementations • 1 Aug 2022 • Tze Ho Elden Tse, Zhongqun Zhang, Kwang In Kim, Aleš Leonardis, Feng Zheng, Hyung Jin Chang

In this paper, we propose a novel semi-supervised framework that allows us to learn contact from monocular images.

hand-object pose Object

Prompting for Multi-Modal Tracking

no code implementations • 29 Jul 2022 • Jinyu Yang, Zhe Li, Feng Zheng, Aleš Leonardis, Jingkuan Song

Multi-modal tracking gains attention due to its ability to be more accurate and robust in complex scenarios compared to traditional RGB-based tracking.

Ranked #21 on Rgb-T Tracking on LasHeR

Rgb-T Tracking

Exploiting Context Information for Generic Event Boundary Captioning

1 code implementation • 3 Jul 2022 • Jinrui Zhang, Teng Wang, Feng Zheng, Ran Cheng, Ping Luo

Previous methods only process the information of a single boundary at a time, which lacks utilization of video context information.

Boundary Captioning

VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix

1 code implementation • 17 Jun 2022 • Teng Wang, Wenhao Jiang, Zhichao Lu, Feng Zheng, Ran Cheng, Chengguo Yin, Ping Luo

Existing vision-language pre-training (VLP) methods primarily rely on paired image-text datasets, which are either annotated with enormous human labor or crawled from the internet and then cleaned with elaborate data cleaning techniques.

Contrastive Learning Data Augmentation +2

VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization

no code implementations • 25 Apr 2022 • Minghui Chen, Cheng Wen, Feng Zheng, Fengxiang He, Ling Shao

The tangent transfer creates initial augmented samples for improving corruption robustness.

Data Augmentation Out-of-Distribution Generalization

Semantic-Aware Pretraining for Dense Video Captioning

no code implementations • 13 Apr 2022 • Teng Wang, Zhu Liu, Feng Zheng, Zhichao Lu, Ran Cheng, Ping Luo

This report describes the details of our approach for the event dense-captioning task in ActivityNet Challenge 2021.

Dense Captioning Dense Video Captioning

RGBD Object Tracking: An In-depth Review

1 code implementation • 26 Mar 2022 • Jinyu Yang, Zhe Li, Song Yan, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen, Ling Shao

Particularly, we are the first to provide depth quality evaluation and analysis of tracking results in depth-friendly scenarios in RGBD tracking.

Object Object Tracking

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression

1 code implementation • CVPR 2022 • Xiaosu Zhu, Jingkuan Song, Lianli Gao, Feng Zheng, Heng Tao Shen

Modeling latent variables with priors and hyperpriors is an essential problem in variational image compression.

Image Compression Quantization

Error-based Knockoffs Inference for Controlled Feature Selection

no code implementations • 9 Mar 2022 • Xuebin Zhao, Hong Chen, Yingjie Wang, Weifu Li, Tieliang Gong, Yulong Wang, Feng Zheng

Recently, the scheme of model-X knockoffs was proposed as a promising solution to address controlled feature selection under high-dimensional finite-sample settings.

Feature Importance feature selection

Skating-Mixer: Long-Term Sport Audio-Visual Modeling with MLPs

1 code implementation • 8 Mar 2022 • Jingfei Xia, Mingchen Zhuge, Tiantian Geng, Shun Fan, Yuantai Wei, Zhenyu He, Feng Zheng

Figure skating scoring is challenging because it requires judging the technical moves of the players as well as their coordination with the background music.

Representation Learning

Class-Aware Contrastive Semi-Supervised Learning

1 code implementation • CVPR 2022 • Fan Yang, Kai Wu, Shuyi Zhang, Guannan Jiang, Yong Liu, Feng Zheng, Wei Zhang, Chengjie Wang, Long Zeng

Pseudo-label-based semi-supervised learning (SSL) has achieved great success on raw data utilization.

Ranked #1 on Semi-Supervised Image Classification on CIFAR-100 (250 Labels, ImageNet-100 Unlabeled)

Pseudo Label Semi-Supervised Image Classification

A Survey of Visual Sensory Anomaly Detection

1 code implementation • 14 Feb 2022 • Xi Jiang, Guoyang Xie, Jinbao Wang, Yong Liu, Chengjie Wang, Feng Zheng, Yaochu Jin

In this survey, we are the first to provide a comprehensive review of visual sensory AD, categorizing it into three levels according to the form of anomalies.

Anomaly Detection

Cross-Modality Neuroimage Synthesis: A Survey

no code implementations • 14 Feb 2022 • Guoyang Xie, Yawen Huang, Jinbao Wang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin

This is followed by a stepwise in-depth analysis to evaluate how cross-modality neuroimage synthesis improves the performance of its downstream tasks.

Image Generation Weakly-supervised Learning

FedMed-ATL: Misaligned Unpaired Brain Image Synthesis via Affine Transform Loss

1 code implementation • 29 Jan 2022 • Jinbao Wang, Guoyang Xie, Yawen Huang, Yefeng Zheng, Yaochu Jin, Feng Zheng

The proposed method demonstrates advanced performance in the quality of its synthesized results under a severely misaligned and unpaired data setting, as well as better stability than other GAN-based algorithms.

Data Augmentation Image Generation +1

FedMed-GAN: Federated Domain Translation on Unsupervised Cross-Modality Brain Image Synthesis

1 code implementation • 22 Jan 2022 • Jinbao Wang, Guoyang Xie, Yawen Huang, Jiayi Lyu, Yefeng Zheng, Feng Zheng, Yaochu Jin

There is a clear need to adopt federated learning and facilitate the integration of dispersed data from different institutions.

Federated Learning Image Generation +1

Meta Distribution Alignment for Generalizable Person Re-Identification

1 code implementation • CVPR 2022 • Hao Ni, Jingkuan Song, Xiaopeng Luo, Feng Zheng, Wen Li, Heng Tao Shen

Domain Generalizable (DG) person ReID is a challenging task which trains a model on source domains yet generalizes well on target domains.

Generalizable Person Re-identification Meta-Learning +1

GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference

no code implementations • 28 Dec 2021 • Peng Tu, Yawen Huang, Feng Zheng, Zhenyu He, Liujun Cao, Ling Shao

In this paper, we propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net, by leveraging labeled information to guide the learning of unlabeled instances.

Segmentation Semi-Supervised Semantic Segmentation

Benchmarks for Corruption Invariant Person Re-identification

1 code implementation • 1 Nov 2021 • Minghui Chen, Zhiqiang Wang, Feng Zheng

When deploying a person re-identification (ReID) model in safety-critical applications, it is pivotal to understand the robustness of the model against a diverse array of image corruptions.

Ranked #1 on Cross-Modal Person Re-Identification on RegDB-C (mINP (Visible to Thermal) metric)

Cross-Modal Person Re-Identification Generalizable Person Re-identification

DepthTrack: Unveiling the Power of RGBD Tracking

1 code implementation • 31 Aug 2021 • Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen

RGBD (RGB plus depth) object tracking is gaining momentum as RGBD sensors have become popular in many application fields such as robotics. However, the best RGBD trackers are extensions of the state-of-the-art deep RGB trackers.

Object Tracking

Seminar Learning for Click-Level Weakly Supervised Semantic Segmentation

no code implementations • ICCV 2021 • Hongjun Chen, Jinbao Wang, Hong Cai Chen, XianTong Zhen, Feng Zheng, Rongrong Ji, Ling Shao

Annotation burden has become one of the biggest barriers to semantic segmentation.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

End-to-End Dense Video Captioning with Parallel Decoding

2 code implementations • ICCV 2021 • Teng Wang, Ruimao Zhang, Zhichao Lu, Feng Zheng, Ran Cheng, Ping Luo

Dense video captioning aims to generate multiple associated captions with their temporal locations from the video.

Ranked #5 on Dense Video Captioning on YouCook2

Caption Generation Dense Video Captioning

An Information-theoretic Perspective of Hierarchical Clustering

no code implementations • 13 Aug 2021 • YiCheng Pan, Feng Zheng, Bingchen Fan

In this paper, we investigate hierarchical clustering from the information-theoretic perspective and formulate a new objective function.

Clustering

Saliency-Associated Object Tracking

1 code implementation • ICCV 2021 • Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He

A potential limitation of such trackers is that not all patches are equally informative for tracking.

Object Object Tracking

FREE: Feature Refinement for Generalized Zero-Shot Learning

1 code implementation • ICCV 2021 • Shiming Chen, Wenjie Wang, Beihao Xia, Qinmu Peng, Xinge You, Feng Zheng, Ling Shao

FREE employs a feature refinement (FR) module that incorporates semantic→visual mapping into a unified generative model to refine the visual features of seen and unseen class samples.

Generalized Zero-Shot Learning

WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

no code implementations • 7 Jul 2021 • Peidong Liu, Zibin He, Xiyu Yan, Yong Jiang, Shutao Xia, Feng Zheng, Maowei Hu

In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, for saving laborious annotating effort by segmenting an instance of the semantic class with only a single click.

Knowledge Distillation Model Compression +3

GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference

1 code implementation • 29 Jun 2021 • Peng Tu, Yawen Huang, Rongrong Ji, Feng Zheng, Ling Shao

To take advantage of the labeled examples and guide unlabeled data learning, we further propose a mask generation module to generate high-quality pseudo masks for the unlabeled data.

Ranked #1 on Semi-Supervised Semantic Segmentation on PASCAL VOC 2012 500 labels

Semi-Supervised Semantic Segmentation

Learning 3D Shape Feature for Texture-Insensitive Person Re-Identification

1 code implementation • CVPR 2021 • Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng

In this paper, rather than relying on texture based information, we propose to improve the robustness of person ReID against clothing texture by exploiting the information of a person's 3D shape.

Ranked #4 on Person Re-Identification on PRCC

3D Reconstruction Person Re-Identification

Brain Image Synthesis With Unsupervised Multivariate Canonical CSCl4Net

no code implementations • CVPR 2021 • Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net

no code implementations • 22 Mar 2021 • Yawen Huang, Feng Zheng, Danyang Wang, Weilin Huang, Matthew R. Scott, Ling Shao

Recent advances in neuroscience have highlighted the effectiveness of multi-modal medical data for investigating certain pathologies and understanding human cognition.

Image Generation

Tiny Adversarial Mulit-Objective Oneshot Neural Architecture Search

no code implementations • 28 Feb 2021 • Guoyang Xie, Jinbao Wang, Guo Yu, Feng Zheng, Yaochu Jin

Our work focuses on how to improve the robustness of tiny neural networks without seriously deteriorating clean accuracy under mobile-level resources.

Neural Architecture Search

A Bayesian Federated Learning Framework with Online Laplace Approximation

no code implementations • 3 Feb 2021 • Liangxi Liu, Xi Jiang, Feng Zheng, Hong Chen, Guo-Jun Qi, Heng Huang, Ling Shao

On the client side, a prior loss that uses the global posterior probabilistic parameters delivered from the server is designed to guide the local training.

Federated Learning

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search

2 code implementations • 8 Jan 2021 • Chenyang Gao, Guanyu Cai, Xinyang Jiang, Feng Zheng, Jun Zhang, Yifei Gong, Pai Peng, Xiaowei Guo, Xing Sun

Secondly, a BERT with locality-constrained attention is proposed to obtain representations of descriptions at different scales.

Ranked #15 on Text based Person Retrieval on CUHK-PEDES

Descriptive Sentence +2

DepthTrack: Unveiling the Power of RGBD Tracking

1 code implementation • ICCV 2021 • Song Yan, Jinyu Yang, Jani Käpylä, Feng Zheng, Aleš Leonardis, Joni-Kristian Kämäräinen

This can be explained by the fact that there are no sufficiently large RGBD datasets to 1) train "deep depth trackers" and to 2) challenge RGB trackers with sequences for which the depth cue is essential.

Object Tracking

Paper
Code

One for More: Selecting Generalizable Samples for Generalizable ReID Model

1 code implementation • 10 Dec 2020 • Enwei Zhang, Xinyang Jiang, Hao Cheng, AnCong Wu, Fufu Yu, Ke Li, Xiaowei Guo, Feng Zheng, Wei-Shi Zheng, Xing Sun

Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard for the performance on samples outside the batch.

Person Re-Identification

Paper
Code

Multi-task Additive Models for Robust Estimation and Automatic Structure Discovery

no code implementations • NeurIPS 2020 • Yingjie Wang, Hong Chen, Feng Zheng, Chen Xu, Tieliang Gong, Yanhong Chen

For high-dimensional observations in real environments, e.g., Coronal Mass Ejections (CMEs) data, the learning performance of previous methods may be seriously degraded due to the complex non-Gaussian noise and the insufficiency of prior knowledge on variable structure.

Additive models Bilevel Optimization +1

Paper
Add Code

A Parallel Down-Up Fusion Network for Salient Object Detection in Optical Remote Sensing Images

no code implementations • 2 Oct 2020 • Chongyi Li, Runmin Cong, Chunle Guo, Hua Li, Chunjie Zhang, Feng Zheng, Yao Zhao

In this paper, we propose a novel Parallel Down-up Fusion network (PDF-Net) for SOD in optical RSIs, which takes full advantage of the in-path low- and high-level features and cross-path multi-resolution features to distinguish diversely scaled salient objects and suppress the cluttered backgrounds.

object-detection Object Detection +1

Paper
Add Code

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

1 code implementation • 11 Sep 2020 • Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

Secondly, the Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most of the existing methods ignore the reference images.

Ranked #1 on Person Re-Identification on CUHK03-C

Person Re-Identification

Paper
Code

LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark

1 code implementation • 3 Aug 2020 • Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng

We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.

Thermal Infrared Object Tracking Vocal Bursts Intensity Prediction

Paper
Code

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

Paper
Code

3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth and Single Color Image

no code implementations • 20 Mar 2020 • Rui Xiang, Feng Zheng, Huapeng Su, Zhe Zhang

In this paper, we propose an end-to-end deep learning network named 3dDepthNet, which produces an accurate dense depth image from a single pair of sparse LiDAR depth and color image for robotics and autonomous driving tasks.

Autonomous Driving Decoder +2

Paper
Add Code

Viewpoint-Aware Loss with Angular Regularization for Person Re-Identification

1 code implementation • 3 Dec 2019 • Zhihui Zhu, Xinyang Jiang, Feng Zheng, Xiaowei Guo, Feiyue Huang, Wei-Shi Zheng, Xing Sun

Instead of one subspace for each viewpoint, our method projects the feature from different viewpoints into a unified hypersphere and effectively models the feature distribution on both the identity-level and the viewpoint-level.

Ranked #5 on Person Re-Identification on Market-1501 (using extra training data)

Person Re-Identification

Paper
Code

Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

2 code implementations • 28 Nov 2019 • Xinyang Jiang, Yifei Gong, Xiaowei Guo, Qize Yang, Feiyue Huang, Wei-Shi Zheng, Feng Zheng, Xing Sun

Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating frame features of an entire video.

Video-Based Person Re-Identification

Paper
Code

Supervised Online Hashing via Similarity Distribution Learning

no code implementations • 31 May 2019 • Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

In this paper, we propose to model the similarity distributions between the input data and the hashing codes, upon which a novel supervised online hashing method, dubbed Similarity Distribution based Online Hashing (SDOH), is proposed to keep the intrinsic semantic relationship in the produced Hamming space.

Retrieval

Paper
Add Code

Deep Spectral Clustering using Dual Autoencoder Network

no code implementations • CVPR 2019 • Xu Yang, Cheng Deng, Feng Zheng, Junchi Yan, Wei Liu

In this paper, we propose a joint learning framework for discriminative embedding and spectral clustering.

Clustering Deep Clustering +1

Paper
Add Code

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training

1 code implementation • CVPR 2019 • Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji

Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other.

Ranked #2 on Person Re-Identification on CUHK03-C

Person Re-Identification

Paper
Code

Unsupervised Deep Generative Adversarial Hashing Network

no code implementations • CVPR 2018 • Kamran Ghasedi Dizaji, Feng Zheng, Najmeh Sadoughi, Yanhua Yang, Cheng Deng, Heng Huang

HashGAN consists of three networks, a generator, a discriminator and an encoder.

Clustering Image Clustering +2

Paper
Add Code

Trifo-VIO: Robust and Efficient Stereo Visual Inertial Odometry using Points and Lines

no code implementations • 6 Mar 2018 • Feng Zheng, Grace Tsai, Zhe Zhang, Shaoshan Liu, Chen-Chi Chu, Hongbing Hu

In this paper, we present the Trifo Visual Inertial Odometry (Trifo-VIO), a tightly-coupled filtering-based stereo VIO system using both points and lines.

Paper
Add Code

PIRVS: An Advanced Visual-Inertial SLAM System with Flexible Sensor Fusion and Hardware Co-Design

no code implementations • 2 Oct 2017 • Zhe Zhang, Shaoshan Liu, Grace Tsai, Hongbing Hu, Chen-Chi Chu, Feng Zheng

In this paper, we present the PerceptIn Robotics Vision System (PIRVS), visual-inertial computing hardware with an embedded simultaneous localization and mapping (SLAM) algorithm.

Sensor Fusion Simultaneous Localization and Mapping

Paper
Add Code

Dual-reference Face Retrieval

no code implementations • 2 Jun 2017 • BingZhang Hu, Feng Zheng, Ling Shao

Face retrieval has received much attention over the past few decades, and many efforts have been made in retrieving face images against pose, illumination, and expression variations.

Retrieval

Paper
Add Code
