no code implementations • 1 Apr 2024 • Shikai Li, Jianglin Fu, Kaiyuan Liu, Wentao Wang, Kwan-Yee Lin, Wayne Wu
We present CosmicMan, a text-to-image foundation model specialized for generating high-fidelity human images.
no code implementations • 14 Oct 2023 • Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu
We first propose a novel score function, Denoised Score Distillation (DSD), which directly modifies the SDS by introducing negative gradient components to iteratively correct the gradient direction and generate high-quality textures.
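The gradient correction in DSD can be pictured with a minimal numerical sketch. This is an illustrative toy under stated assumptions (the `neg_pred_noise` term and `guidance` weight are hypothetical stand-ins for the paper's negative gradient components), not the authors' implementation:

```python
import numpy as np

def sds_gradient(pred_noise, true_noise, weight=1.0):
    """Vanilla SDS-style gradient: weighted residual between the
    diffusion model's predicted noise and the sampled noise."""
    return weight * (pred_noise - true_noise)

def dsd_gradient(pred_noise, true_noise, neg_pred_noise, guidance=0.5):
    """DSD-style correction (sketch): subtract a negative gradient
    component so the update direction is iteratively corrected."""
    return sds_gradient(pred_noise, true_noise) - guidance * (neg_pred_noise - true_noise)

rng = np.random.default_rng(0)
pred = rng.normal(size=4)   # model's conditional noise prediction
true = rng.normal(size=4)   # sampled noise
neg = rng.normal(size=4)    # hypothetical "undesired" prediction
g = dsd_gradient(pred, true, neg)
```

In practice the corrected gradient would be backpropagated into the texture parameters at each optimization step.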
no code implementations • 9 Oct 2023 • Baixin Xu, Jiangbei Hu, Fei Hou, Kwan-Yee Lin, Wayne Wu, Chen Qian, Ying He
In this paper, we present a novel neural algorithm to parameterize neural implicit surfaces to simple parametric domains, such as spheres, cubes, or polycubes, thereby facilitating visualization and various editing tasks.
1 code implementation • ICCV 2023 • Honglin He, Zhuoqian Yang, Shikai Li, Bo Dai, Wayne Wu
We present a new method for generating realistic and view-consistent images with fine geometry from 2D image collections.
no code implementations • 25 Sep 2023 • Rongzhang Gu, Hui Li, Changyue Su, Wayne Wu
Digital storytelling, as an art form, has long struggled to balance cost and quality.
1 code implementation • ICCV 2023 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Wayne Wu, Ziwei Liu
A holistic human dataset inevitably has insufficient and low-resolution information on local parts.
1 code implementation • 5 Sep 2023 • Haonan Qiu, Zhaoxi Chen, Yuming Jiang, Hang Zhou, Xiangyu Fan, Lei Yang, Wayne Wu, Ziwei Liu
Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.
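The idea of separating reflectance from shading given facial normals can be sketched under a simple Lambertian assumption; the single directional light and the `recover_albedo` helper are hypothetical simplifications for illustration, not the paper's learned decomposition:

```python
import numpy as np

def lambertian_shading(normals, light_dir):
    """Per-pixel Lambertian shading: max(0, n . l)."""
    l = light_dir / np.linalg.norm(light_dir)
    return np.clip(normals @ l, 0.0, None)

def recover_albedo(image, normals, light_dir, eps=1e-6):
    """Reflectance (albedo) = observed intensity / shading."""
    s = lambertian_shading(normals, light_dir)
    return image / (s + eps)

# Two toy pixels: one facing the light, one tilted away.
normals = np.array([[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])
light = np.array([0.0, 0.0, 1.0])
img = np.array([0.8, 0.3])
albedo = recover_albedo(img, normals, light)
```

Relighting then amounts to re-shading the recovered albedo under a new light direction.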
no code implementations • 31 Aug 2023 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He
Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and training time to learn a person-specific audio-video mapping.
no code implementations • ICCV 2023 • Yuxin Wang, Wayne Wu, Dan Xu
State-of-the-art methods in this direction typically consider building separate networks for these two tasks (i.e., view synthesis and editing).
1 code implementation • ICCV 2023 • Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin
Realistic human-centric rendering plays a key role in both computer vision and computer graphics.
1 code implementation • NeurIPS 2023 • Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin
It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees.
no code implementations • 19 Apr 2023 • Zhuo Chen, Xudong Xu, Yichao Yan, Ye Pan, Wenhan Zhu, Wayne Wu, Bo Dai, Xiaokang Yang
While the use of 3D-aware GANs bypasses the requirement of 3D data, we further remove the need for style images by using the CLIP model as the stylization guidance.
1 code implementation • ICCV 2023 • Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.
1 code implementation • CVPR 2023 • Zhengming Yu, Wei Cheng, Xian Liu, Wayne Wu, Kwan-Yee Lin
Recent works propose to graft a deformation network into the NeRF to further model the dynamics of the human neural field for animating vivid human motions.
1 code implementation • ICCV 2023 • Zhitao Yang, Zhongang Cai, Haiyi Mei, Shuai Liu, Zhaoxi Chen, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang
Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.
1 code implementation • CVPR 2023 • Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu
This paper presents CelebV-Text, a large-scale, diverse, and high-quality dataset of facial text-video pairs, to facilitate research on facial text-to-video generation tasks.
1 code implementation • CVPR 2023 • Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Jiawei Ren, Liang Pan, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, Ziwei Liu
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases.
1 code implementation • ICCV 2023 • Zhuoqian Yang, Shikai Li, Wayne Wu, Bo Dai
We present 3DHumanGAN, a 3D-aware generative adversarial network that synthesizes photorealistic images of full-body humans with consistent appearances under different view-angles and body-poses.
no code implementations • 5 Dec 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu
Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.
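One way to picture this decomposition is splitting a motion trajectory into a smooth common pattern plus a rhythmic residual; the moving-average split below is an illustrative assumption, not the paper's learned factorization:

```python
import numpy as np

def decompose_motion(seq, window=5):
    """Split a 1-D motion trajectory into a smooth common pattern
    (moving average) and a rhythmic residual component."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(seq, pad, mode="edge")
    common = np.convolve(padded, kernel, mode="valid")
    rhythmic = seq - common
    return common, rhythmic

t = np.linspace(0, 2 * np.pi, 64)
seq = np.sin(t) + 0.1 * np.sin(10 * t)  # slow pattern + fast rhythm
common, rhythmic = decompose_motion(seq)
recon = common + rhythmic
```

The two components sum back to the original trajectory, mirroring how the decomposed parts jointly determine the final gesture.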
1 code implementation • 3 Dec 2022 • Jintao Lin, Zhaoyang Liu, Wenhai Wang, Wayne Wu, LiMin Wang
Our VLG is first pre-trained on video and language datasets to learn a shared feature space, and then employs a flexible bi-modal attention head to combine high-level semantic concepts under different settings.
1 code implementation • ICCV 2023 • Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (using extra training data)
1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu
Notably, StyleFaceV is capable of generating realistic 1024×1024 face videos even without high-resolution training videos.
1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy
Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.
Ranked #1 on Unconditional Video Generation on CelebV-HQ
1 code implementation • 11 Jul 2022 • Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu
In this paper, we present a spatial-temporal compression framework, Fast-Vid2Vid, which focuses on data aspects of generative models.
no code implementations • 30 Jun 2022 • Jiaqi Tang, Zhaoyang Liu, Jing Tan, Chen Qian, Wayne Wu, LiMin Wang
A local context modeling sub-network is proposed to perceive diverse patterns of generic event boundaries; it generates powerful video representations and reliable boundary confidence.
2 code implementations • 31 May 2022 • Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.
no code implementations • 30 May 2022 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao
Although significant progress has been made to audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects.
2 code implementations • 25 Apr 2022 • Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, LiMin Wang
This paper focuses on the weakly-supervised audio-visual video parsing task, which aims to recognize all events belonging to each modality and localize their temporal boundaries.
4 code implementations • 25 Apr 2022 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
1 code implementation • 25 Apr 2022 • Wei Cheng, Su Xu, Jingtan Piao, Chen Qian, Wayne Wu, Kwan-Yee Lin, Hongsheng Li
Specifically, we compress the light fields for novel view human rendering as conditional implicit neural radiance fields from both geometry and appearance aspects.
1 code implementation • CVPR 2022 • Yanbo Xu, Yueqin Yin, Liming Jiang, Qianyi Wu, Chengyao Zheng, Chen Change Loy, Bo Dai, Wayne Wu
In this study, we highlight the importance of interaction in a dual-space GAN for more controllable editing.
1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou
To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.
Ranked #3 on Gesture Generation on TED Gesture Dataset
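A generic contrastive-alignment objective of this kind can be sketched as a symmetric InfoNCE loss over matched audio/text embedding pairs; the temperature and the exact loss form here are assumptions, not the paper's implementation:

```python
import numpy as np

def info_nce(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched audio/text pairs:
    the diagonal of the similarity matrix holds the positives."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ t.T / temperature
    n = logits.shape[0]
    idx = np.arange(n)
    # Cross-entropy with diagonal targets, in both directions.
    log_p_a2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t2a = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(log_p_a2t[idx, idx].mean() + log_p_t2a[idx, idx].mean()) / 2

a = np.eye(4, 8)                                    # 4 orthogonal "audio" embeddings
loss_matched = info_nce(a, a)                       # text aligned with audio
loss_shuffled = info_nce(a, np.roll(a, 1, axis=0))  # mismatched pairs
```

Aligned pairs drive the loss toward zero, while mismatched pairings are penalized, which is what pulls the audio representation toward its transcript.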
no code implementations • 19 Jan 2022 • Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou
Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.
no code implementations • 19 Dec 2021 • Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy
Trained with the canonicalization operations and the derived regularizations, our method learns to factorize a skeleton sequence into three independent semantic subspaces, i.e., motion, structure, and view angle.
3 code implementations • CVPR 2022 • Jiaqi Tang, Zhaoyang Liu, Chen Qian, Wayne Wu, LiMin Wang
Generic event boundary detection is an important yet challenging task in video understanding, which aims at detecting the moments where humans naturally perceive event boundaries.
2 code implementations • NeurIPS 2021 • Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
no code implementations • CVPR 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
1 code implementation • CVPR 2021 • Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu
While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.
1 code implementation • CVPR 2021 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu
In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audios.
1 code implementation • 7 Apr 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan
This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.
1 code implementation • ICCV 2021 • Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further.
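A minimal frequency-domain objective of the kind described can be sketched with a 2-D FFT; this omits any focal re-weighting and is an illustrative assumption, not the paper's loss:

```python
import numpy as np

def frequency_loss(real, fake):
    """Mean squared distance between the 2-D FFT spectra of a real
    and a reconstructed image -- a simplified frequency-domain loss."""
    fr = np.fft.fft2(real)
    ff = np.fft.fft2(fake)
    return np.mean(np.abs(fr - ff) ** 2)

img = np.outer(np.hanning(8), np.hanning(8))  # toy 8x8 "image"
zero = frequency_loss(img, img)               # identical spectra
gap = frequency_loss(img, np.zeros_like(img)) # maximal spectral gap
```

Minimizing such a term alongside a spatial loss is one way to narrow the gap between real and synthesized spectra.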
no code implementations • NeurIPS 2020 • Hao Zhu, Chaoyou Fu, Qianyi Wu, Wayne Wu, Chen Qian, Ran He
However, detection algorithms may fail on Deepfakes with large variance in appearance, since datasets with such variance are lacking and can hardly be produced by recent identity-swapping methods.
2 code implementations • ECCV 2020 • Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images for providing a geometric counterpart to the RGB representation.
2 code implementations • ICCV 2021 • Zhao-Yang Liu, Li-Min Wang, Wayne Wu, Chen Qian, Tong Lu
Video data exhibits complex temporal dynamics due to various factors such as camera motion, speed variation, and different activities.
no code implementations • CVPR 2020 • Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy
We present a lightweight video motion retargeting approach, TransMoMo, that is capable of realistically transferring the motion of a person in a source video to another video of a target person.
no code implementations • 15 Jan 2020 • Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio.
1 code implementation • CVPR 2020 • Liming Jiang, Ren Li, Wayne Wu, Chen Qian, Chen Change Loy
The quality of generated videos outperforms those in existing datasets, validated by user studies.
1 code implementation • ICCV 2019 • Keqiang Sun, Wayne Wu, Tinghao Liu, Shuo Yang, Quan Wang, Qiang Zhou, Zuochang Ye, Chen Qian
A structure predictor is proposed to predict the missing face structural information temporally, which serves as a geometry prior.
no code implementations • ICCV 2019 • Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He
Recent studies have shown remarkable success in face manipulation tasks with the advance of GAN and VAE paradigms, but the outputs are sometimes limited to low resolution and lack diversity.
1 code implementation • ICCV 2019 • Shengju Qian, Keqiang Sun, Wayne Wu, Chen Qian, Jiaya Jia
Facial landmark detection, or face alignment, is a fundamental task that has been extensively studied.
Ranked #18 on Face Alignment on WFLW
1 code implementation • ICLR Workshop DeepGenStruct 2019 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably.
no code implementations • CVPR 2019 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks.
no code implementations • 27 Sep 2018 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
It is challenging to disentangle an object into two orthogonal spaces of structure and appearance since each can influence the visual observation in a different and unpredictable way.
1 code implementation • ECCV 2018 • Wayne Wu, Yunxuan Zhang, Cheng Li, Chen Qian, Chen Change Loy
A transformer is subsequently used to adapt the boundary of the source face to that of the target face.
2 code implementations • CVPR 2018 • Wayne Wu, Chen Qian, Shuo Yang, Quan Wang, Yici Cai, Qiang Zhou
By utilising boundary information of the 300-W dataset, our method achieves 3.92% mean error with 0.39% failure rate on the COFW dataset, and 1.25% mean error on the AFLW-Full dataset.
Ranked #4 on Face Alignment on AFLW-19 (using extra training data)
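The reported numbers follow the standard normalized-mean-error protocol for face alignment, which can be sketched as follows; the helper name, toy landmarks, and threshold are hypothetical:

```python
import numpy as np

def mean_error_and_failure(pred, gt, norm_dist, threshold=0.10):
    """Normalized mean error (NME) and failure rate for landmarks.
    Per-image error = mean landmark distance / normalizing distance
    (e.g. inter-ocular); an image 'fails' if its NME > threshold."""
    per_point = np.linalg.norm(pred - gt, axis=-1)   # (N, K)
    nme = per_point.mean(axis=-1) / norm_dist        # (N,)
    return nme.mean(), (nme > threshold).mean()

# Two toy images with 3 landmarks each, offset by 0.01 in x.
gt = np.zeros((2, 3, 2))
pred = gt + np.array([0.01, 0.0])
mean_err, fail = mean_error_and_failure(pred, gt, norm_dist=1.0)
```

The "failure rate" quoted above is simply the fraction of test images whose per-image NME exceeds the chosen threshold.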