no code implementations • 1 Apr 2024 • Shikai Li, Jianglin Fu, Kaiyuan Liu, Wentao Wang, Kwan-Yee Lin, Wayne Wu
We present CosmicMan, a text-to-image foundation model specialized for generating high-fidelity human images.
no code implementations • 14 Oct 2023 • Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu
We first propose a novel score function, Denoised Score Distillation (DSD), which directly modifies the SDS by introducing negative gradient components to iteratively correct the gradient direction and generate high-quality textures.
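The gradient correction in DSD can be pictured with a minimal numerical sketch. This is an illustrative toy under stated assumptions (the `neg_pred_noise` term and `guidance` weight are hypothetical stand-ins for the paper's negative gradient components), not the authors' implementation:

```python
import numpy as np

def sds_gradient(pred_noise, true_noise, weight=1.0):
    """Vanilla SDS-style gradient: weighted residual between the
    diffusion model's predicted noise and the sampled noise."""
    return weight * (pred_noise - true_noise)

def dsd_gradient(pred_noise, true_noise, neg_pred_noise, guidance=0.5):
    """DSD-style correction (sketch): subtract a negative gradient
    component so the update direction is iteratively corrected."""
    return sds_gradient(pred_noise, true_noise) - guidance * (neg_pred_noise - true_noise)

rng = np.random.default_rng(0)
pred = rng.normal(size=4)   # model's conditional noise prediction
true = rng.normal(size=4)   # sampled noise
neg = rng.normal(size=4)    # hypothetical "undesired" prediction
g = dsd_gradient(pred, true, neg)
```

In practice the corrected gradient would be backpropagated into the texture parameters at each optimization step.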
no code implementations • 9 Oct 2023 • Baixin Xu, Jiangbei Hu, Fei Hou, Kwan-Yee Lin, Wayne Wu, Chen Qian, Ying He
In this paper, we present a novel neural algorithm to parameterize neural implicit surfaces to simple parametric domains, such as spheres, cubes, or polycubes, thereby facilitating visualization and various editing tasks.
1 code implementation • ICCV 2023 • Honglin He, Zhuoqian Yang, Shikai Li, Bo Dai, Wayne Wu
We present a new method for generating realistic and view-consistent images with fine geometry from 2D image collections.
no code implementations • 25 Sep 2023 • Rongzhang Gu, Hui Li, Changyue Su, Wayne Wu
Digital storytelling, as an art form, has long struggled to balance cost and quality.
1 code implementation • ICCV 2023 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Wayne Wu, Ziwei Liu
A holistic human dataset inevitably has insufficient and low-resolution information on local parts.
1 code implementation • 5 Sep 2023 • Haonan Qiu, Zhaoxi Chen, Yuming Jiang, Hang Zhou, Xiangyu Fan, Lei Yang, Wayne Wu, Ziwei Liu
Our key insight is to decompose the portrait's reflectance from implicitly learned audio-driven facial normals and images.
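The idea of separating reflectance from shading given facial normals can be sketched under a simple Lambertian assumption; the single directional light and the `recover_albedo` helper are hypothetical simplifications for illustration, not the paper's learned decomposition:

```python
import numpy as np

def lambertian_shading(normals, light_dir):
    """Per-pixel Lambertian shading: max(0, n . l)."""
    l = light_dir / np.linalg.norm(light_dir)
    return np.clip(normals @ l, 0.0, None)

def recover_albedo(image, normals, light_dir, eps=1e-6):
    """Reflectance (albedo) = observed intensity / shading."""
    s = lambertian_shading(normals, light_dir)
    return image / (s + eps)

# Two toy pixels: one facing the light, one tilted away.
normals = np.array([[0.0, 0.0, 1.0], [0.6, 0.0, 0.8]])
light = np.array([0.0, 0.0, 1.0])
img = np.array([0.8, 0.3])
albedo = recover_albedo(img, normals, light)
```

Relighting then amounts to re-shading the recovered albedo under a new light direction.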
no code implementations • 31 Aug 2023 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He
Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and training time to learn a person-specific audio-video mapping.
no code implementations • ICCV 2023 • Yuxin Wang, Wayne Wu, Dan Xu
State-of-the-art methods in this direction typically consider building separate networks for these two tasks (i.e., view synthesis and editing).
1 code implementation • ICCV 2023 • Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin
Realistic human-centric rendering plays a key role in both computer vision and computer graphics.
1 code implementation • NeurIPS 2023 • Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin
It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees.
no code implementations • 19 Apr 2023 • Zhuo Chen, Xudong Xu, Yichao Yan, Ye Pan, Wenhan Zhu, Wayne Wu, Bo Dai, Xiaokang Yang
While the use of 3D-aware GANs bypasses the requirement of 3D data, we further remove the need for style images by using the CLIP model as the stylization guidance.
1 code implementation • ICCV 2023 • Yuming Jiang, Shuai Yang, Tong Liang Koh, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present Text2Performer to generate vivid human videos with articulated motions from texts.
1 code implementation • CVPR 2023 • Zhengming Yu, Wei Cheng, Xian Liu, Wayne Wu, Kwan-Yee Lin
Recent works propose to graft a deformation network into the NeRF to further model the dynamics of the human neural field for animating vivid human motions.
1 code implementation • ICCV 2023 • Zhitao Yang, Zhongang Cai, Haiyi Mei, Shuai Liu, Zhaoxi Chen, Weiye Xiao, Yukun Wei, Zhongfei Qing, Chen Wei, Bo Dai, Wayne Wu, Chen Qian, Dahua Lin, Ziwei Liu, Lei Yang
Synthetic data has emerged as a promising source for 3D human research as it offers low-cost access to large-scale human datasets.
1 code implementation • CVPR 2023 • Jianhui Yu, Hao Zhu, Liming Jiang, Chen Change Loy, Weidong Cai, Wayne Wu
This paper presents CelebV-Text, a large-scale, diverse, and high-quality dataset of facial text-video pairs, to facilitate research on facial text-to-video generation tasks.
1 code implementation • CVPR 2023 • Tong Wu, Jiarui Zhang, Xiao Fu, Yuxin Wang, Jiawei Ren, Liang Pan, Wayne Wu, Lei Yang, Jiaqi Wang, Chen Qian, Dahua Lin, Ziwei Liu
Recent advances in modeling 3D objects mostly rely on synthetic datasets due to the lack of large-scale real-scanned 3D databases.
1 code implementation • ICCV 2023 • Zhuoqian Yang, Shikai Li, Wayne Wu, Bo Dai
We present 3DHumanGAN, a 3D-aware generative adversarial network that synthesizes photorealistic images of full-body humans with consistent appearances under different view-angles and body-poses.
no code implementations • 5 Dec 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu
Our key insight is that the co-speech gestures can be decomposed into common motion patterns and subtle rhythmic dynamics.
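One way to picture this decomposition is splitting a motion trajectory into a smooth common pattern plus a rhythmic residual; the moving-average split below is an illustrative assumption, not the paper's learned factorization:

```python
import numpy as np

def decompose_motion(seq, window=5):
    """Split a 1-D motion trajectory into a smooth common pattern
    (moving average) and a rhythmic residual component."""
    kernel = np.ones(window) / window
    pad = window // 2
    padded = np.pad(seq, pad, mode="edge")
    common = np.convolve(padded, kernel, mode="valid")
    rhythmic = seq - common
    return common, rhythmic

t = np.linspace(0, 2 * np.pi, 64)
seq = np.sin(t) + 0.1 * np.sin(10 * t)  # slow pattern + fast rhythm
common, rhythmic = decompose_motion(seq)
recon = common + rhythmic
```

The two components sum back to the original trajectory, mirroring how the decomposed parts jointly determine the final gesture.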
1 code implementation • 3 Dec 2022 • Jintao Lin, Zhaoyang Liu, Wenhai Wang, Wayne Wu, LiMin Wang
Our VLG is first pre-trained on video and language datasets to learn a shared feature space, and then employs a flexible bi-modal attention head to combine high-level semantic concepts under different settings.
1 code implementation • ICCV 2023 • Wentao Zhu, Xiaoxuan Ma, Zhaoyang Liu, Libin Liu, Wayne Wu, Yizhou Wang
We present a unified perspective on tackling various human-centric video tasks by learning human motion representations from large-scale and heterogeneous data resources.
Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (using extra training data)
1 code implementation • 16 Aug 2022 • Haonan Qiu, Yuming Jiang, Hang Zhou, Wayne Wu, Ziwei Liu
Notably, StyleFaceV is capable of generating realistic 1024×1024 face videos even without high-resolution training videos.
1 code implementation • 25 Jul 2022 • Hao Zhu, Wayne Wu, Wentao Zhu, Liming Jiang, Siwei Tang, Li Zhang, Ziwei Liu, Chen Change Loy
Large-scale datasets have played indispensable roles in the recent success of face generation/editing and significantly facilitated the advances of emerging research fields.
Ranked #1 on Unconditional Video Generation on CelebV-HQ
1 code implementation • 11 Jul 2022 • Long Zhuo, Guangcong Wang, Shikai Li, Wayne Wu, Ziwei Liu
In this paper, we present a spatial-temporal compression framework, Fast-Vid2Vid, which focuses on data aspects of generative models.
no code implementations • 30 Jun 2022 • Jiaqi Tang, Zhaoyang Liu, Jing Tan, Chen Qian, Wayne Wu, LiMin Wang
A local context modeling sub-network is proposed to perceive diverse patterns of generic event boundaries; it generates powerful video representations and reliable boundary confidence.
2 code implementations • 31 May 2022 • Yuming Jiang, Shuai Yang, Haonan Qiu, Wayne Wu, Chen Change Loy, Ziwei Liu
In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation.
no code implementations • 30 May 2022 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Wayne Wu, Feng Xu, Xun Cao
Although significant progress has been made to audio-driven talking face generation, existing methods either neglect facial emotion or cannot be applied to arbitrary subjects.
2 code implementations • 25 Apr 2022 • Haoyue Cheng, Zhaoyang Liu, Hang Zhou, Chen Qian, Wayne Wu, LiMin Wang
This paper focuses on the weakly-supervised audio-visual video parsing task, which aims to recognize all events belonging to each modality and localize their temporal boundaries.
4 code implementations • 25 Apr 2022 • Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
1 code implementation • 25 Apr 2022 • Wei Cheng, Su Xu, Jingtan Piao, Chen Qian, Wayne Wu, Kwan-Yee Lin, Hongsheng Li
Specifically, we compress the light fields for novel view human rendering as conditional implicit neural radiance fields from both geometry and appearance aspects.
1 code implementation • CVPR 2022 • Yanbo Xu, Yueqin Yin, Liming Jiang, Qianyi Wu, Chengyao Zheng, Chen Change Loy, Bo Dai, Wayne Wu
In this study, we highlight the importance of interaction in a dual-space GAN for more controllable editing.
1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou
To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.
Ranked #3 on Gesture Generation on TED Gesture Dataset
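A generic contrastive-alignment objective of this kind can be sketched as a symmetric InfoNCE loss over matched audio/text embedding pairs; the temperature and the exact loss form here are assumptions, not the paper's implementation:

```python
import numpy as np

def info_nce(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched audio/text pairs:
    the diagonal of the similarity matrix holds the positives."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ t.T / temperature
    n = logits.shape[0]
    idx = np.arange(n)
    # Cross-entropy with diagonal targets, in both directions.
    log_p_a2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_t2a = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(log_p_a2t[idx, idx].mean() + log_p_t2a[idx, idx].mean()) / 2

a = np.eye(4, 8)                                    # 4 orthogonal "audio" embeddings
loss_matched = info_nce(a, a)                       # text aligned with audio
loss_shuffled = info_nce(a, np.roll(a, 1, axis=0))  # mismatched pairs
```

Aligned pairs drive the loss toward zero, while mismatched pairings are penalized, which is what pulls the audio representation toward its transcript.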
no code implementations • 19 Jan 2022 • Xian Liu, Yinghao Xu, Qianyi Wu, Hang Zhou, Wayne Wu, Bolei Zhou
Moreover, to enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.
no code implementations • 19 Dec 2021 • Wentao Zhu, Zhuoqian Yang, Ziang Di, Wayne Wu, Yizhou Wang, Chen Change Loy
Trained with the canonicalization operations and the derived regularizations, our method learns to factorize a skeleton sequence into three independent semantic subspaces, i.e., motion, structure, and view angle.
3 code implementations • CVPR 2022 • Jiaqi Tang, Zhaoyang Liu, Chen Qian, Wayne Wu, LiMin Wang
Generic event boundary detection is an important yet challenging task in video understanding, which aims at detecting the moments where humans naturally perceive event boundaries.
2 code implementations • NeurIPS 2021 • Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
no code implementations • CVPR 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
1 code implementation • CVPR 2021 • Hang Zhou, Yasheng Sun, Wayne Wu, Chen Change Loy, Xiaogang Wang, Ziwei Liu
While speech content information can be defined by learning the intrinsic synchronization between audio-visual modalities, we identify that a pose code will be complementarily learned in a modulated convolution-based reconstruction framework.
1 code implementation • CVPR 2021 • Xinya Ji, Hang Zhou, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu
In this work, we present Emotional Video Portraits (EVP), a system for synthesizing high-quality video portraits with vivid emotional dynamics driven by audios.
1 code implementation • 7 Apr 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
2 code implementations • 18 Feb 2021 • Liming Jiang, Zhengkui Guo, Wayne Wu, Zhaoyang Liu, Ziwei Liu, Chen Change Loy, Shuo Yang, Yuanjun Xiong, Wei Xia, Baoying Chen, Peiyu Zhuang, Sili Li, Shen Chen, Taiping Yao, Shouhong Ding, Jilin Li, Feiyue Huang, Liujuan Cao, Rongrong Ji, Changlei Lu, Ganchao Tan
This paper reports methods and results in the DeeperForensics Challenge 2020 on real-world face forgery detection.
1 code implementation • ICCV 2021 • Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further.
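A minimal frequency-domain objective of the kind described can be sketched with a 2-D FFT; this omits any focal re-weighting and is an illustrative assumption, not the paper's loss:

```python
import numpy as np

def frequency_loss(real, fake):
    """Mean squared distance between the 2-D FFT spectra of a real
    and a reconstructed image -- a simplified frequency-domain loss."""
    fr = np.fft.fft2(real)
    ff = np.fft.fft2(fake)
    return np.mean(np.abs(fr - ff) ** 2)

img = np.outer(np.hanning(8), np.hanning(8))  # toy 8x8 "image"
zero = frequency_loss(img, img)               # identical spectra
gap = frequency_loss(img, np.zeros_like(img)) # maximal spectral gap
```

Minimizing such a term alongside a spatial loss is one way to narrow the gap between real and synthesized spectra.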
no code implementations • NeurIPS 2020 • Hao Zhu, Chaoyou Fu, Qianyi Wu, Wayne Wu, Chen Qian, Ran He
However, detection algorithms may fail on Deepfakes with large variance in appearance, since datasets with such variance are lacking and can hardly be produced by recent identity-swapping methods.
2 code implementations • ECCV 2020 • Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images for providing a geometric counterpart to the RGB representation.
2 code implementations • ICCV 2021 • Zhao-Yang Liu, Li-Min Wang, Wayne Wu, Chen Qian, Tong Lu
Video data exhibits complex temporal dynamics due to various factors such as camera motion, speed variation, and different activities.
no code implementations • CVPR 2020 • Zhuoqian Yang, Wentao Zhu, Wayne Wu, Chen Qian, Qiang Zhou, Bolei Zhou, Chen Change Loy
We present a lightweight video motion retargeting approach, TransMoMo, that is capable of realistically transferring the motion of a person in a source video to another video of a target person.
no code implementations • 15 Jan 2020 • Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio.
1 code implementation • CVPR 2020 • Liming Jiang, Ren Li, Wayne Wu, Chen Qian, Chen Change Loy
The quality of generated videos outperforms those in existing datasets, validated by user studies.
1 code implementation • ICCV 2019 • Keqiang Sun, Wayne Wu, Tinghao Liu, Shuo Yang, Quan Wang, Qiang Zhou, Zuochang Ye, Chen Qian
A structure predictor is proposed to predict the missing face structural information temporally, which serves as a geometry prior.
no code implementations • ICCV 2019 • Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He
Recent studies have shown remarkable success in face manipulation tasks with the advance of GAN and VAE paradigms, but the outputs are sometimes limited to low resolution and lack diversity.
1 code implementation • ICCV 2019 • Shengju Qian, Keqiang Sun, Wayne Wu, Chen Qian, Jiaya Jia
Facial landmark detection, or face alignment, is a fundamental task that has been extensively studied.
Ranked #18 on Face Alignment on WFLW
1 code implementation • ICLR Workshop DeepGenStruct 2019 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably.
no code implementations • CVPR 2019 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks.
no code implementations • 27 Sep 2018 • Wayne Wu, Kaidi Cao, Cheng Li, Chen Qian, Chen Change Loy
It is challenging to disentangle an object into two orthogonal spaces of structure and appearance since each can influence the visual observation in a different and unpredictable way.
1 code implementation • ECCV 2018 • Wayne Wu, Yunxuan Zhang, Cheng Li, Chen Qian, Chen Change Loy
A transformer is subsequently used to adapt the boundary of the source face to that of the target face.
2 code implementations • CVPR 2018 • Wayne Wu, Chen Qian, Shuo Yang, Quan Wang, Yici Cai, Qiang Zhou
By utilising boundary information of the 300-W dataset, our method achieves 3.92% mean error with 0.39% failure rate on the COFW dataset, and 1.25% mean error on the AFLW-Full dataset.
Ranked #4 on Face Alignment on AFLW-19 (using extra training data)
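The reported numbers follow the standard normalized-mean-error protocol for face alignment, which can be sketched as follows; the helper name, toy landmarks, and threshold are hypothetical:

```python
import numpy as np

def mean_error_and_failure(pred, gt, norm_dist, threshold=0.10):
    """Normalized mean error (NME) and failure rate for landmarks.
    Per-image error = mean landmark distance / normalizing distance
    (e.g. inter-ocular); an image 'fails' if its NME > threshold."""
    per_point = np.linalg.norm(pred - gt, axis=-1)   # (N, K)
    nme = per_point.mean(axis=-1) / norm_dist        # (N,)
    return nme.mean(), (nme > threshold).mean()

# Two toy images with 3 landmarks each, offset by 0.01 in x.
gt = np.zeros((2, 3, 2))
pred = gt + np.array([0.01, 0.0])
mean_err, fail = mean_error_and_failure(pred, gt, norm_dist=1.0)
```

The "failure rate" quoted above is simply the fraction of test images whose per-image NME exceeds the chosen threshold.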