no code implementations • 11 Mar 2024 • Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
In this paper, we propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
1 code implementation • 15 Feb 2024 • Jiaxin Zhang, Zhongzhi Li, Mingliang Zhang, Fei Yin, ChengLin Liu, Yashar Moshfeghi
To address this gap, we introduce the GeoEval benchmark, a comprehensive collection that includes a main subset of 2,000 problems, a 750-problem subset focusing on backward reasoning, an augmented subset of 2,000 problems, and a hard subset of 300 problems.
no code implementations • 25 Nov 2023 • Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu
Existing neural solvers treat GPS as a vision-language task but fall short in representing geometry diagrams, which carry rich and complex layout information.
no code implementations • ICCV 2023 • Yuan Gong, Yong Zhang, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang, Baoyuan Wu, Yujiu Yang
Moreover, since no paired data is provided, we propose a novel cross-domain training scheme using data from two domains with the designed analogy constraint.
no code implementations • 7 Jul 2023 • Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu
In this work, we propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.
no code implementations • 3 Jun 2023 • Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang
These elaborated designs enable our model to generate portraits with robust multi-view semantic consistency, eliminating the need for optimization-based methods.
1 code implementation • 26 May 2023 • Gongye Liu, Haoze Sun, Jiayi Li, Fei Yin, Yujiu Yang
To derive the transitional state during the forward process, we introduce Distortion Adaptive Inversion.
1 code implementation • 22 Feb 2023 • Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu
Geometry problem solving (GPS) is a high-level mathematical reasoning task requiring the capacities of multi-modal fusion and geometric knowledge application.
Ranked #1 on Mathematical Reasoning on PGPS9K
no code implementations • ICCV 2023 • Yunfei Guo, Fei Yin, Xiao-Hui Li, Xudong Yan, Tao Xue, Shuqi Mei, Cheng-Lin Liu
Although previous works on traffic scene understanding have achieved great success, most of them stop at a low-level perception stage, such as road segmentation and lane detection, and few concern high-level understanding.
1 code implementation • 1 Jan 2023 • Fei Yin, Yong Zhang, Baoyuan Wu, Yan Feng, Jingyi Zhang, Yanbo Fan, Yujiu Yang
In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget.
no code implementations • CVPR 2023 • Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang
It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred to as 3D GAN inversion.
1 code implementation • 27 Nov 2022 • Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang
Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism.
no code implementations • 18 Jul 2022 • Hoang Le, Liang Zhang, Amir Said, Guillaume Sautiere, Yang Yang, Pranav Shrestha, Fei Yin, Reza Pourreza, Auke Wiggers
Realizing the potential of neural video codecs on mobile devices is a big technological challenge due to the computational complexity of deep networks and the power-constrained mobile hardware.
1 code implementation • 16 Jul 2022 • Yong Liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang
However, they mainly focus on better matching between the current frame and the memory frames without explicitly paying attention to the quality of the memory.
Ranked #11 on Semi-Supervised Video Object Segmentation on DAVIS 2016 (using extra training data)
1 code implementation • 20 May 2022 • Yihan Hao, Mingliang Zhang, Fei Yin, Linlin Huang
An appropriate dataset is critical for the research of PGDP.
1 code implementation • 19 May 2022 • Ming-Liang Zhang, Fei Yin, Yi-Han Hao, Cheng-Lin Liu
Geometry diagram parsing plays a key role in geometry problem solving, wherein primitive extraction and relation parsing remain challenging due to the complex layout and between-primitive relationships.
Ranked #1 on Scene Parsing on PGDP5K
1 code implementation • 20 Mar 2022 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
In this paper, we propose a simple yet effective approach to rectify distorted document images by estimating control points and reference points.
1 code implementation • 8 Mar 2022 • Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
Our framework elevates the resolution of the synthesized talking face to 1024×1024 for the first time, even though the training dataset has a lower resolution.
no code implementations • CVPR 2022 • Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Kumar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, Balaji Calidas, Fatih Porikli
Such a deployment scheme best utilizes the available processing power on the smartphone and enables real-time operation of our adaptive video segmentation algorithm.
no code implementations • 10 Oct 2021 • Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang
Concretely, we propose a novel dual-encoder architecture, in which an identity encoder extracts the identity-related feature, accompanied by a main encoder to obtain the rough contour information and further fuse all the information together.
no code implementations • CVPR 2021 • Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
To overcome the lack of character-level annotations, we propose a novel weakly-supervised character center detection module, which only uses word-level annotated real images to generate character-level labels.
1 code implementation • CVPR 2021 • Fei Zhu, Xu-Yao Zhang, Chuang Wang, Fei Yin, Cheng-Lin Liu
Despite the impressive performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning new tasks incrementally.
1 code implementation • 14 Apr 2021 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
As camera-based documents are increasingly used, the rectification of distorted document images becomes a need to improve the recognition performance.
no code implementations • 1 Dec 2020 • Mengbiao Zhao, Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
We propose an Expectation-Maximization (EM) based weakly-supervised learning framework to train an accurate arbitrary-shaped text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data.
no code implementations • International Joint Conference on Artificial Intelligence 2018 • Yue Xu, Fei Yin, Zhaoxiang Zhang, Cheng-Lin Liu
Layout analysis is a fundamental process in document image analysis and understanding.
no code implementations • 2 Jun 2018 • Yi-Chao Wu, Fei Yin, Xu-Yao Zhang, Li Liu, Cheng-Lin Liu
Scene text recognition has drawn great attention in the computer vision and artificial intelligence communities due to its challenges and wide applications.
3 code implementations • CVPR 2018 • Hong-Ming Yang, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
To improve the robustness, we propose a novel learning framework called convolutional prototype learning (CPL).
no code implementations • 6 Sep 2017 • Fei Yin, Yi-Chao Wu, Xu-Yao Zhang, Cheng-Lin Liu
In this paper, we investigate the intrinsic characteristics of text recognition and, inspired by human cognition mechanisms in reading texts, propose a scene text recognition method with character models on convolutional feature maps.
no code implementations • ICCV 2017 • Wenhao He, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
To verify this point of view, we propose a deep direct regression based method for multi-oriented scene text detection.
1 code implementation • 21 Jun 2016 • Xu-Yao Zhang, Fei Yin, Yan-Ming Zhang, Cheng-Lin Liu, Yoshua Bengio
In this paper, we propose a framework by using the recurrent neural network (RNN) as both a discriminative model for recognizing Chinese characters and a generative model for drawing (generating) Chinese characters.