Search Results for author: Fei Yin

Found 30 papers, 14 papers with code

Ensemble Quadratic Assignment Network for Graph Matching

no code implementations • 11 Mar 2024 • Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu

In this paper, we propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.

3D Shape Classification Graph Matching +1

GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving

1 code implementation • 15 Feb 2024 • Jiaxin Zhang, Zhongzhi Li, Mingliang Zhang, Fei Yin, ChengLin Liu, Yashar Moshfeghi

To address this gap, we introduce the GeoEval benchmark, a comprehensive collection that includes a main subset of 2,000 problems, a 750-problem subset focusing on backward reasoning, an augmented subset of 2,000 problems, and a hard subset of 300 problems.

Geometry Problem Solving Math

LANS: A Layout-Aware Neural Solver for Plane Geometry Problem

no code implementations • 25 Nov 2023 • Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu

Existing neural solvers take GPS as a vision-language task but are short in the representation of geometry diagrams that carry rich and complex layout information.

Geometry Problem Solving Language Modelling

ToonTalker: Cross-Domain Face Reenactment

no code implementations • ICCV 2023 • Yuan Gong, Yong Zhang, Xiaodong Cun, Fei Yin, Yanbo Fan, Xuan Wang, Baoyuan Wu, Yujiu Yang

Moreover, since no paired data is provided, we propose a novel cross-domain training scheme using data from two domains with the designed analogy constraint.

Face Reenactment Talking Face Generation

NOFA: NeRF-based One-shot Facial Avatar Reconstruction

no code implementations • 7 Jul 2023 • Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu

In this work, we propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.

Decoder

Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution

no code implementations • 3 Jun 2023 • Yiji Cheng, Fei Yin, Xiaoke Huang, Xintong Yu, Jiaxiang Liu, Shikun Feng, Yujiu Yang, Yansong Tang

These elaborated designs enable our model to generate portraits with robust multi-view semantic consistency, eliminating the need for optimization-based methods.

Text to 3D

Accelerating Diffusion Models for Inverse Problems through Shortcut Sampling

1 code implementation • 26 May 2023 • Gongye Liu, Haoze Sun, Jiayi Li, Fei Yin, Yujiu Yang

To derive the transitional state during the forward process, we introduce Distortion Adaptive Inversion.

Colorization Deblurring +1

A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram

1 code implementation • 22 Feb 2023 • Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu

Geometry problem solving (GPS) is a high-level mathematical reasoning requiring the capacities of multi-modal fusion and geometric knowledge application.

Ranked #1 on Mathematical Reasoning on PGPS9K

Geometry Problem Solving

Visual Traffic Knowledge Graph Generation from Scene Images

no code implementations • ICCV 2023 • Yunfei Guo, Fei Yin, Xiao-Hui Li, Xudong Yan, Tao Xue, Shuqi Mei, Cheng-Lin Liu

Although previous works on traffic scene understanding have achieved great success, most of them stop at a low-level perception stage, such as road segmentation and lane detection, and few concern high-level understanding.

Graph Attention Graph Generation +4

Generalizable Black-Box Adversarial Attack with Meta Learning

1 code implementation • 1 Jan 2023 • Fei Yin, Yong Zhang, Baoyuan Wu, Yan Feng, Jingyi Zhang, Yanbo Fan, Yujiu Yang

In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget.

Adversarial Attack Meta-Learning

3D GAN Inversion with Facial Symmetry Prior

no code implementations • CVPR 2023 • Fei Yin, Yong Zhang, Xuan Wang, Tengfei Wang, Xiaoyu Li, Yuan Gong, Yanbo Fan, Xiaodong Cun, Ying Shan, Cengiz Oztireli, Yujiu Yang

It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space, allowing free-view consistent synthesis and editing, referred to as 3D GAN inversion.

Image Reconstruction Neural Rendering

VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

1 code implementation • 27 Nov 2022 • Kun Cheng, Xiaodong Cun, Yong Zhang, Menghan Xia, Fei Yin, Mingrui Zhu, Xuan Wang, Jue Wang, Nannan Wang

Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism.

Video Editing Video Generation

5,890

MobileCodec: Neural Inter-frame Video Compression on Mobile Devices

no code implementations • 18 Jul 2022 • Hoang Le, Liang Zhang, Amir Said, Guillaume Sautiere, Yang Yang, Pranav Shrestha, Fei Yin, Reza Pourreza, Auke Wiggers

Realizing the potential of neural video codecs on mobile devices is a big technological challenge due to the computational complexity of deep networks and the power-constrained mobile hardware.

Decoder Video Compression

Learning Quality-aware Dynamic Memory for Video Object Segmentation

1 code implementation • 16 Jul 2022 • Yong Liu, Ran Yu, Fei Yin, Xinyuan Zhao, Wei Zhao, Weihao Xia, Yujiu Yang

However, they mainly focus on better matching between the current frame and the memory frames without explicitly paying attention to the quality of the memory.

Ranked #11 on Semi-Supervised Video Object Segmentation on DAVIS 2016 (using extra training data)

Segmentation Semantic Segmentation +2

139

PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems

1 code implementation • 20 May 2022 • Yihan Hao, Mingliang Zhang, Fei Yin, Linlin Huang

An appropriate dataset is critical for the research of PGDP.

Geometry Problem Solving

Plane Geometry Diagram Parsing

1 code implementation • 19 May 2022 • Ming-Liang Zhang, Fei Yin, Yi-Han Hao, Cheng-Lin Liu

Geometry diagram parsing plays a key role in geometry problem solving, wherein the primitive extraction and relation parsing remain challenging due to the complex layout and between-primitive relationship.

Ranked #1 on Scene Parsing on PGDP5K

Geometry Problem Solving Graph Neural Network +5

Document Dewarping with Control Points

1 code implementation • 20 Mar 2022 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

In this paper, we propose a simple yet effective approach to rectify distorted document images by estimating control points and reference points.

Optical Character Recognition (OCR)

147

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

1 code implementation • 8 Mar 2022 • Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang

Our framework elevates the resolution of the synthesized talking face to 1024×1024 for the first time, even though the training dataset has a lower resolution.

Facial Editing Talking Face Generation +1

608

Real-Time, Accurate, and Consistent Video Semantic Segmentation via Unsupervised Adaptation and Cross-Unit Deployment on Mobile Device

no code implementations • CVPR 2022 • Hyojin Park, Alan Yessenbayev, Tushar Singhal, Navin Kumar Adhikari, Yizhe Zhang, Shubhankar Mangesh Borse, Hong Cai, Nilesh Prasad Pandey, Fei Yin, Frank Mayer, Balaji Calidas, Fatih Porikli

Such a deployment scheme best utilizes the available processing power on the smartphone and enables real-time operation of our adaptive video segmentation algorithm.

Segmentation Semantic Segmentation +2

Identity-guided Face Generation with Multi-modal Contour Conditions

no code implementations • 10 Oct 2021 • Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang

Concretely, we propose a novel dual-encoder architecture, in which an identity encoder extracts the identity-related feature, accompanied by a main encoder to obtain the rough contour information and further fuse all the information together.

Face Generation Image Restoration

Semantic-Aware Video Text Detection

no code implementations • CVPR 2021 • Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

To overcome the lack of character-level annotations, we propose a novel weakly-supervised character center detection module, which only uses word-level annotated real images to generate character-level labels.

Text Detection

Prototype Augmentation and Self-Supervision for Incremental Learning

1 code implementation • CVPR 2021 • Fei Zhu, Xu-Yao Zhang, Chuang Wang, Fei Yin, Cheng-Lin Liu

Despite the impressive performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning new tasks incrementally.

Incremental Learning Self-Supervised Learning

720

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

1 code implementation • 14 Apr 2021 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

As camera-based documents are increasingly used, the rectification of distorted document images becomes a need to improve the recognition performance.

153

Weakly-Supervised Arbitrary-Shaped Text Detection with Expectation-Maximization Algorithm

no code implementations • 1 Dec 2020 • Mengbiao Zhao, Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu

We propose an Expectation-Maximization (EM) based weakly-supervised learning framework to train an accurate arbitrary-shaped text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data.

Text Detection Weakly-supervised Learning

Multi-task Layout Analysis for Historical Handwritten Documents Using Fully Convolutional Networks

no code implementations • International Joint Conference on Artificial Intelligence 2018 • Yue Xu, Fei Yin, Zhaoxiang Zhang, Cheng-Lin Liu

Layout analysis is a fundamental process in document image analysis and understanding.

SCAN: Sliding Convolutional Attention Network for Scene Text Recognition

no code implementations • 2 Jun 2018 • Yi-Chao Wu, Fei Yin, Xu-Yao Zhang, Li Liu, Cheng-Lin Liu

Scene text recognition has drawn great attention in the computer vision and artificial intelligence communities due to its challenges and wide applications.

Scene Text Recognition

Robust Classification with Convolutional Prototype Learning

3 code implementations • CVPR 2018 • Hong-Ming Yang, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu

To improve the robustness, we propose a novel learning framework called convolutional prototype learning (CPL).

Classification General Classification +2

124

Scene Text Recognition with Sliding Convolutional Character Models

no code implementations • 6 Sep 2017 • Fei Yin, Yi-Chao Wu, Xu-Yao Zhang, Cheng-Lin Liu

In this paper, we investigate the intrinsic characteristics of text recognition, and inspired by human cognition mechanisms in reading texts, we propose a scene text recognition method with character models on convolutional feature map.

Scene Text Recognition

Deep Direct Regression for Multi-Oriented Scene Text Detection

no code implementations • ICCV 2017 • Wenhao He, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu

To verify this point of view, we propose a deep direct regression based method for multi-oriented scene text detection.

Multi-Oriented Scene Text Detection object-detection +3

Drawing and Recognizing Chinese Characters with Recurrent Neural Network

1 code implementation • 21 Jun 2016 • Xu-Yao Zhang, Fei Yin, Yan-Ming Zhang, Cheng-Lin Liu, Yoshua Bengio

In this paper, we propose a framework by using the recurrent neural network (RNN) as both a discriminative model for recognizing Chinese characters and a generative model for drawing (generating) Chinese characters.

Handwriting Recognition
