1 code implementation • 23 May 2024 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
Our key idea is to reconstruct the diffusion training process, introducing more refined guidance tailored to this task, to expose and rectify the model's attention at the character level and strengthen its learning of text regions.
no code implementations • 7 May 2024 • Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin
Cross-modal knowledge transfer enhances point cloud representation learning in LiDAR semantic segmentation.
no code implementations • 25 Apr 2024 • Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin
We apply this robust fine-tuning method to mainstream 3D point cloud pre-trained models and evaluate the quality of model parameters and the degradation of downstream task performance.
1 code implementation • 20 Mar 2024 • LeoWu TomyEnrique, Xiangcheng Du, Kangliang Liu, Han Yuan, Zhao Zhou, Cheng Jin
Scene text image super-resolution has significantly improved the accuracy of scene text recognition.
1 code implementation • 8 Mar 2024 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
This prior information is encoded into the attention weights, which are then integrated into the self-attention layers of the generator to guide the synthesis process.
no code implementations • 28 Feb 2024 • Weilin Wan, Weizhong Zhang, Quan Zhou, Fan Yi, Cheng Jin
Our neural activation prior is based on a key observation that, for a channel before the global pooling layer of a fully trained neural network, the probability of a few neurons being activated with a large response by an in-distribution (ID) sample is significantly higher than that by an OOD sample.
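The observation above suggests a simple score: compare each channel's strongest pre-pooling response to its average response. The sketch below is only an illustration of that intuition, not the paper's exact scoring function; the function name and the max-over-mean ratio are assumptions.

```python
import numpy as np

def activation_prior_score(feature_map):
    """Illustrative OOD score from pre-pooling activations.

    feature_map: (C, H, W) array taken before global average pooling.
    For each channel, compare the largest spatial response to the
    channel's mean response; ID samples tend to fire a few neurons
    strongly, so a larger ratio suggests in-distribution input.
    """
    per_channel_max = feature_map.max(axis=(1, 2))
    per_channel_mean = feature_map.mean(axis=(1, 2)) + 1e-8
    return float((per_channel_max / per_channel_mean).max())

# Toy check: a map with a few strong activations scores higher
# than a uniformly flat map.
flat = np.full((4, 8, 8), 0.5)
peaky = np.full((4, 8, 8), 0.1)
peaky[0, 3, 3] = 5.0
```

A higher score would then be read as evidence that the sample is in-distribution.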
no code implementations • 3 Jan 2024 • Jiawei Zhang, Yufan Chen, Cheng Jin, Lei Zhu, Yuantao Gu
Out-of-distribution (OOD) detection plays a crucial role in ensuring the security of neural networks.
1 code implementation • 19 Dec 2023 • Kaiyi Zhang, Yang Chen, Ximing Yang, Weizhong Zhang, Cheng Jin
Based on this process, we introduce SGAS, a model for part editing that employs two strategies: feature disentanglement and constraint.
1 code implementation • 9 Dec 2023 • Renao Yan, Qiehe Sun, Cheng Jin, Yiqing Liu, Yonghong He, Tian Guan, Hao Chen
Most conventional MIL methods use attention scores to estimate instance importance scores (IIS), which contribute to the prediction of the slide labels; however, these scores often lead to skewed attention distributions and inaccuracies in identifying crucial instances.
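For context, the attention-based MIL pooling that such methods build on can be sketched as follows. This is a generic ABMIL-style sketch, not this paper's method; the parameter names `W` and `v` are hypothetical.

```python
import numpy as np

def attention_mil_pool(instances, W, v):
    """Generic attention-based MIL pooling (a sketch).

    instances: (N, D) patch embeddings forming one slide's bag.
    W: (H, D) and v: (H,) are attention parameters.
    The softmax weights double as the instance importance scores
    (IIS) that conventional MIL methods reuse, which is exactly
    where skewed attention distributions can arise.
    """
    raw = np.tanh(instances @ W.T) @ v        # (N,) unnormalized scores
    raw = raw - raw.max()                     # numerical stability
    attn = np.exp(raw) / np.exp(raw).sum()    # softmax over the bag
    bag_embedding = attn @ instances          # (D,) slide-level feature
    return bag_embedding, attn

rng = np.random.default_rng(0)
patches = rng.normal(size=(6, 16))            # a toy bag of 6 instances
W = rng.normal(size=(8, 16))
v = rng.normal(size=(8,))
bag, iis = attention_mil_pool(patches, W, v)
```

Because `iis` is a softmax over the whole bag, a few instances can dominate the distribution, which motivates revisiting how IIS are estimated.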
no code implementations • 19 Nov 2023 • Weijie Li, Yitian Wan, Xingjiao Wu, Junjie Xu, Cheng Jin, Liang He
Then, to better utilize image attributes in aesthetic assessment, we propose the Unified Multi-attribute Aesthetic Assessment Framework (UMAAF) to model both absolute and relative attributes of images.
1 code implementation • 17 Nov 2023 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
Specifically, we first develop two specialized pre-trained diffusion models, i.e., the Text-driven Diffusion Model (TDM) and the Subject-augmented Diffusion Model (SDM), for scene and person generation, respectively.
1 code implementation • 17 Nov 2023 • Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin
Layout-to-image synthesis is an emerging technique in conditional image generation.
1 code implementation • 29 Oct 2023 • Anran Wu, Luwei Xiao, Xingjiao Wu, Shuwen Yang, Junjie Xu, Zisong Zhuang, Nian Xie, Cheng Jin, Liang He
Our DCQA dataset is expected to foster research on understanding visualizations in documents, especially for scenarios that require complex reasoning for charts in the visually-rich document.
no code implementations • 15 Oct 2023 • Shuwen Yang, Anran Wu, Xingjiao Wu, Luwei Xiao, Tianlong Ma, Cheng Jin, Liang He
Firstly, utilizing compressed evidence features as input to the model results in the loss of fine-grained information within the evidence.
no code implementations • 22 Sep 2023 • Zhilei Hu, Zixuan Li, Daozhu Xu, Long Bai, Cheng Jin, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
To comprehensively understand their intrinsic semantics, in this paper, we obtain prototype representations for each type of event relation and propose a Prototype-Enhanced Matching (ProtoEM) framework for the joint extraction of multiple kinds of event relations.
no code implementations • 10 Sep 2023 • Xiaolu Wang, Cheng Jin, Hoi-To Wai, Yuantao Gu
This paper considers a type of incremental aggregated gradient (IAG) method for large-scale distributed optimization.
1 code implementation • 27 Jul 2023 • Lingdong Kong, Yaru Niu, Shaoyuan Xie, Hanjiang Hu, Lai Xing Ng, Benoit R. Cottereau, Ding Zhao, Liangjun Zhang, Hesheng Wang, Wei Tsang Ooi, Ruijie Zhu, Ziyang Song, Li Liu, Tianzhu Zhang, Jun Yu, Mohan Jing, Pengwei Li, Xiaohua Qi, Cheng Jin, Yingfeng Chen, Jie Hou, Jie Zhang, Zhen Kan, Qiang Ling, Liang Peng, Minglei Li, Di Xu, Changpeng Yang, Yuanqi Yao, Gang Wu, Jian Kuai, Xianming Liu, Junjun Jiang, Jiamian Huang, Baojun Li, Jiale Chen, Shuang Zhang, Sun Ao, Zhenyu Li, Runze Chen, Haiyong Luo, Fang Zhao, Jingze Yu
In this paper, we summarize the winning solutions from the RoboDepth Challenge -- an academic competition designed to facilitate and advance robust OoD depth estimation.
1 code implementation • 14 Apr 2023 • Shishi Xiao, Yihan Hou, Cheng Jin, Wei Zeng
Retrieving charts from a large corpus is a fundamental task that can benefit numerous applications such as visualization recommendation. The retrieved results are expected to conform to both explicit visual attributes (e.g., chart type, colormap) and implicit user intents (e.g., design style, context information) that vary across application scenarios.
1 code implementation • 13 Apr 2023 • Kangliang Liu, Xiangcheng Du, Sijie Liu, Yingbin Zheng, Xingjiao Wu, Cheng Jin
Transformer is beneficial for image denoising tasks since it can model long-range dependencies to overcome the limitations presented by inductive convolutional biases.
no code implementations • 22 Mar 2023 • Cheng Jin, Zhengrui Guo, Yi Lin, Luyang Luo, Hao Chen
Thus, label-efficient deep learning methods are developed to make comprehensive use of the labeled data as well as the abundance of unlabeled and weak-labeled data.
no code implementations • 25 Nov 2022 • Zhao Zhou, Xiangcheng Du, Yingbin Zheng, Cheng Jin
We present the Aggregated Text TRansformer (ATTR), which is designed to represent texts in scene images with a multi-scale self-attention mechanism.
no code implementations • 23 Jul 2022 • Xiangcheng Du, Zhao Zhou, Yingbin Zheng, Xingjiao Wu, Tianlong Ma, Cheng Jin
Scene text erasing seeks to remove text content from scene images, and current state-of-the-art text erasing models are trained on large-scale synthetic data.
no code implementations • 30 Mar 2022 • Cheng Jin, Rui-Jie Zhu, Xiao Wu, Liang-Jian Deng
Spiking Neural Networks (SNNs) have piqued researchers' interest because of their capacity to process temporal information and low power consumption.
no code implementations • 6 Feb 2022 • Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
Besides, the missing patterns are diverse in reality, but existing methods can only handle fixed ones, which implies poor generalization ability.
no code implementations • 13 Dec 2021 • Ximing Yang, Zhibo Zhang, Zhengfu He, Cheng Jin
Since most structure representations omit fine details, limited controllability over such information is one of the major weaknesses of structure-based controllable point cloud generation.
1 code implementation • 10 Dec 2021 • Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
The points generated by AXform do not have the strong 2-manifold constraint, which improves the generation of non-smooth surfaces.
1 code implementation • 5 Dec 2021 • Jingwen Ye, Yining Mao, Jie Song, Xinchao Wang, Cheng Jin, Mingli Song
In other words, all users may employ a model in SDB for inference, but only authorized users get access to KD from the model.
no code implementations • 27 Nov 2021 • Tianlong Ma, Xingjiao Wu, Xin Li, Xiangcheng Du, Zhao Zhou, Liang Xue, Cheng Jin
To evaluate the proposed image layer modeling method, we present a manually labeled non-Manhattan-layout fine-grained segmentation dataset named FPD.
1 code implementation • 4 Dec 2020 • Zachary J. Lee, George Lee, Ted Lee, Cheng Jin, Rand Lee, Zhi Low, Daniel Chang, Christine Ortega, Steven H. Low
We describe the architecture and algorithms of the Adaptive Charging Network (ACN), which was first deployed on the Caltech campus in early 2016 and is currently operating at over 100 other sites in the United States.
no code implementations • NeurIPS 2020 • Zunlei Feng, Yongming He, Xinchao Wang, Xin Gao, Jie Lei, Cheng Jin, Mingli Song
In this paper, we introduce the One-sample Guided Object Representation Disassembling (One-GORD) method, which only requires one annotated sample for each object category to learn disassembled object representation from unannotated images.
no code implementations • 19 Oct 2020 • Zhanwei Xu, Yukun Cao, Cheng Jin, Guozhu Shao, Xiaoqing Liu, Jie zhou, Heshui Shi, Jianjiang Feng
Segmentation of infected areas in chest CT volumes is of great significance for further diagnosis and treatment of COVID-19 patients.
no code implementations • 14 Nov 2017 • Gongze Cao, Yezhou Yang, Jie Lei, Cheng Jin, Yang Liu, Mingli Song
As an effective form of metric learning, triplet loss has been widely used in many deep learning tasks, including face recognition and person re-identification, leading to many state-of-the-art results.
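The standard triplet loss referenced here can be written in a few lines: it penalizes an embedding only when the positive is not at least a margin closer to the anchor than the negative. A minimal numpy sketch (squared-Euclidean variant; the margin value is an arbitrary choice):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distances in the embedding space.
    d_pos = float(np.sum((anchor - positive) ** 2))
    d_neg = float(np.sum((anchor - negative) ** 2))
    # Hinge: zero loss once the negative is at least `margin`
    # farther from the anchor than the positive.
    return max(d_pos - d_neg + margin, 0.0)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # same identity, close to the anchor
n = np.array([1.0, 0.0])   # different identity, farther away
```

With this well-separated triplet the loss is zero; swapping the positive and negative yields a large positive loss, which is the gradient signal that pulls embeddings of the same identity together.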