Search Results for author: Deli Zhao

Found 59 papers, 24 papers with code

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

1 code implementation • 30 May 2024 • Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Deli Zhao, Lidong Bing

To provide an automatic, robust, and trustworthy evaluation framework, we innovatively propose the Auto-Arena of LLMs, which automates the entire evaluation process with LLM agents.

Chatbot Fairness

Space Group Constrained Crystal Generation

no code implementations • 6 Feb 2024 • Rui Jiao, Wenbing Huang, Yu Liu, Deli Zhao, Yang Liu

Crystals are the foundation of numerous scientific and industrial applications.

Latent Space Editing in Transformer-Based Flow Matching

no code implementations • 17 Dec 2023 • Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek

Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training.

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

3 code implementations • 7 Nov 2023 • Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou

By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos.

6,292

Few-shot Action Recognition with Captioning Foundation Models

no code implementations • 16 Oct 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang

In this paper, we develop an effective plug-and-play framework called CapFSAR to exploit the knowledge of multimodal models without manually annotating text.

Few-Shot action recognition Few Shot Action Recognition

Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner

no code implementations • 14 Oct 2023 • Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu

By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused by applying an inaccurate integral direction to a timestep interval.

Denoising

Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers

no code implementations • ICCV 2023 • Shiyue Cao, Yueqin Yin, Lianghua Huang, Yu Liu, Xin Zhao, Deli Zhao, Kaiqi Huang

Vector-quantized image modeling has shown great potential in synthesizing high-quality images.

Image Generation Image Reconstruction +1

In-Domain GAN Inversion for Faithful Reconstruction and Editability

no code implementations • 25 Sep 2023 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen, Bolei Zhou

This work fills in this gap by proposing in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model.

Image Generation Image Reconstruction

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning

1 code implementation • ICCV 2023 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang

When pre-training on the large-scale Kinetics-710, we achieve 89.7% on Kinetics-400 with a frozen ViT-L model, which verifies the scalability of DiST.

Transfer Learning Video Recognition

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao

In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.

Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)

Graph Generation Human-Object Interaction Detection +6

Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models

no code implementations • ICCV 2023 • Kecheng Zheng, Wei Wu, Ruili Feng, Kai Zhu, Jiawei Liu, Deli Zhao, Zheng-Jun Zha, Wei Chen, Yujun Shen

To bring the useful knowledge back into light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each parameter, and finally optimize these masks on the downstream data with the parameters frozen.

AnyDoor: Zero-shot Object-level Image Customization

2 code implementations • 18 Jul 2023 • Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao

This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations in a harmonious way.

Object Virtual Try-on

6,292

MomentDiff: Generative Video Moment Retrieval from Random to Real

1 code implementation • NeurIPS 2023 • Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang

Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language description.

Moment Retrieval Retrieval

Eliminating Lipschitz Singularities in Diffusion Models

no code implementations • 20 Jun 2023 • Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.

VideoComposer: Compositional Video Synthesis with Motion Controllability

4 code implementations • NeurIPS 2023 • Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou

The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis.

Ranked #5 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)

Image Generation Text-to-Video Generation

848

Cones 2: Customizable Image Synthesis with Multiple Subjects

1 code implementation • 30 May 2023 • Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Synthesizing images with user-specified subjects has received growing attention due to its practical applications.

Image Generation

489

MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition

1 code implementation • CVPR 2023 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang

To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method that contains two crucial components, including a long-short contrastive objective and a motion autodecoder.

Contrastive Learning Few-Shot action recognition +1

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

1 code implementation • CVPR 2023 • Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distribution.

Ranked #7 on Video Generation on UCF-101

Code Generation Denoising +4

6,292

Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos

1 code implementation • ICCV 2023 • Yulin Pan, Xiangteng He, Biao Gong, Yiliang Lv, Yujun Shen, Yuxin Peng, Deli Zhao

Video temporal grounding aims to pinpoint a video segment that matches the query description.

ViM: Vision Middleware for Unified Downstream Transferring

no code implementations • ICCV 2023 • Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.

Cones: Concept Neurons in Diffusion Models for Customized Generation

1 code implementation • 9 Mar 2023 • Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Concatenating multiple clusters of concept neurons can vividly generate all related concepts in a single image.

6,292

CLIP-guided Prototype Modulating for Few-shot Action Recognition

1 code implementation • 6 Mar 2023 • Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang

Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task.

Few-Shot action recognition Few Shot Action Recognition

Rethinking Efficient Tuning Methods from a Unified Perspective

no code implementations • 1 Mar 2023 • Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Yiliang Lv, Deli Zhao, Jingren Zhou

The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which prove to achieve on-par or better performances on CIFAR-100 and FGVC datasets when compared with existing PETL methods.

Transfer Learning

Composer: Creative and Controllable Image Synthesis with Composable Conditions

6 code implementations • 20 Feb 2023 • Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, Jingren Zhou

Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.

Image Colorization Image-to-Image Translation +3

6,292

UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

no code implementations • 14 Feb 2023 • Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.

Common Sense Reasoning

The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition

1 code implementation • 8 Feb 2023 • Jun Cen, Di Luan, Shiwei Zhang, Yixuan Pei, Yingya Zhang, Deli Zhao, Shaojie Shen, Qifeng Chen

Recently, Unified Open-set Recognition (UOSR) has been proposed to reject not only unknown samples but also known but wrongly classified samples, which tends to be more practical in real-world applications.

Open Set Learning

LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis

no code implementations • ICCV 2023 • Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Bo Dai, Deli Zhao, Qifeng Chen

This work presents an easy-to-use regularizer for GAN training, which helps explicitly link some axes of the latent space to a set of pixels in the synthesized image.

Image Generation

RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-Training

no code implementations • CVPR 2023 • Chen-Wei Xie, Siyang Sun, Xiong Xiong, Yun Zheng, Deli Zhao, Jingren Zhou

This process can be considered as an open-book exam: with the reference set as a cheat sheet, the proposed method doesn't need to memorize all visual concepts in the training data.

Classification Image Classification +5

Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval

1 code implementation • ICCV 2023 • Pandeng Li, Chen-Wei Xie, Liming Zhao, Hongtao Xie, Jiannan Ge, Yun Zheng, Deli Zhao, Yongdong Zhang

In the event-sentence prototype matching phase, we design a temporal prototype generation mechanism to associate intra-frame objects and interact inter-frame temporal relations.

Object Retrieval +2

LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook

no code implementations • CVPR 2023 • Jiayu Wang, Kang Zhao, Shiwei Zhang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou

Generating a talking face video from the input audio sequence is a practical yet challenging task.

Talking Face Generation

Self-Organizing Pathway Expansion for Non-Exemplar Class-Incremental Learning

no code implementations • ICCV 2023 • Kai Zhu, Kecheng Zheng, Ruili Feng, Deli Zhao, Yang Cao, Zheng-Jun Zha

Non-exemplar class-incremental learning aims to recognize both the old and new classes without access to old class samples.

Class Incremental Learning Incremental Learning

Space-time Prompting for Video Class-incremental Learning

no code implementations • ICCV 2023 • Yixuan Pei, Zhiwu Qing, Shiwei Zhang, Xiang Wang, Yingya Zhang, Deli Zhao, Xueming Qian

In this paper, we will fill this gap by learning multiple prompts based on a powerful image-language pre-trained model, i.e., CLIP, making it fit for video class-incremental learning (VCIL).

Class Incremental Learning Incremental Learning

Dimensionality-Varying Diffusion Process

no code implementations • CVPR 2023 • Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

Neural Dependencies Emerging from Learning Massive Categories

no code implementations • CVPR 2023 • Ruili Feng, Kecheng Zheng, Kai Zhu, Yujun Shen, Jian Zhao, Yukun Huang, Deli Zhao, Jingren Zhou, Michael Jordan, Zheng-Jun Zha

Through investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, which condition is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others.

Image Classification

Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator

no code implementations • 30 Sep 2022 • Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-yan Yeung

We argue that, considering the two-player game in the formulation of GANs, only making the generator 3D-aware is not enough.

3D-Aware Image Synthesis domain classification +2

Improving GANs with A Dynamic Discriminator

no code implementations • 20 Sep 2022 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou

Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.

3D-Aware Image Synthesis Data Augmentation

Rank Diminishing in Deep Neural Networks

no code implementations • 13 Jun 2022 • Ruili Feng, Kecheng Zheng, Yukun Huang, Deli Zhao, Michael Jordan, Zheng-Jun Zha

By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network rank in practical settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet.

Principled Knowledge Extrapolation with GANs

no code implementations • 21 May 2022 • Ruili Feng, Jie Xiao, Kecheng Zheng, Deli Zhao, Jingren Zhou, Qibin Sun, Zheng-Jun Zha

Humans can extrapolate well, generalize daily knowledge to unseen scenarios, and raise and answer counterfactual questions.

counterfactual

Region-Based Semantic Factorization in GANs

1 code implementation • 19 Feb 2022 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen

Despite the rapid advancement of semantic discovery in the latent space of Generative Adversarial Networks (GANs), existing approaches either are limited to finding global attributes or rely on a number of segmentation masks to identify local attributes.

Low-Rank Subspaces in GANs

1 code implementation • NeurIPS 2021 • Jiapeng Zhu, Ruili Feng, Yujun Shen, Deli Zhao, ZhengJun Zha, Jingren Zhou, Qifeng Chen

Concretely, given an arbitrary image and a region of interest (e.g., eyes of face images), we manage to relate the latent space to the image region with the Jacobian matrix and then use low-rank factorization to discover steerable latent subspaces.

Attribute Generative Adversarial Network

123

On Noise Injection in Generative Adversarial Networks

2 code implementations • 10 Jun 2020 • Ruili Feng, Deli Zhao, ZhengJun Zha

Noise injection has proved to be one of the key technical advances in generating high-fidelity images.

Image Generation

10,844

In-Domain GAN Inversion for Real Image Editing

2 code implementations • ECCV 2020 • Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou

A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.

Image Reconstruction

459

Perceptual Image Super-Resolution with Progressive Adversarial Network

no code implementations • 8 Mar 2020 • Lone Wong, Deli Zhao, Shaohua Wan, Bo Zhang

Progressive growing enhances image resolution gradually, thereby preserving the precision of the recovered image.

Decoder Image Super-Resolution

Latent Variables on Spheres for Autoencoders in High Dimensions

no code implementations • 21 Dec 2019 • Deli Zhao, Jiapeng Zhu, Bo Zhang

Variational Auto-Encoder (VAE) has been widely applied as a fundamental generative model in machine learning.

Vocal Bursts Intensity Prediction

LIA: Latently Invertible Autoencoder with Adversarial Learning

no code implementations • 25 Sep 2019 • Jiapeng Zhu, Deli Zhao, Bolei Zhou, Bo Zhang

A two-stage stochasticity-free training scheme is designed to train LIA via adversarial learning, in the sense that the decoder of LIA is first trained as a standard GAN with the invertible network and then the partial encoder is learned from an autoencoder by detaching the invertible network from LIA.

Decoder Generative Adversarial Network +1

Latent Variables on Spheres for Sampling and Inference

no code implementations • 25 Sep 2019 • Deli Zhao, Jiapeng Zhu, Bo Zhang

Variational inference is a fundamental problem in Variational AutoEncoder (VAE).

Variational Inference

Curriculum Learning for Deep Generative Models with Clustering

no code implementations • 27 Jun 2019 • Deli Zhao, Jiapeng Zhu, Zhenfang Guo, Bo Zhang

The experiments on cat and human-face data validate that our algorithm is able to learn the optimal generative models (e.g., ProGAN) with respect to specified quality metrics for noisy data.

Clustering Generative Adversarial Network

Disentangled Inference for GANs with Latently Invertible Autoencoder

3 code implementations • 19 Jun 2019 • Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou

In this paper, we show that the entanglement of the latent space for the VAE/GAN framework poses the main challenge for encoder learning.

Decoder

DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning

no code implementations • NeurIPS 2018 • Runsheng Yu, Wenyu Liu, Yasen Zhang, Zhi Qu, Deli Zhao, Bo Zhang

Based on these sub-images, a local exposure for each sub-image is learned automatically and sequentially by a policy network, while the learning reward is designed globally to balance the overall exposures.

Few Shot Learning with Simplex

no code implementations • 27 Jul 2018 • Bowen Zhang, Xifan Zhang, Fan Cheng, Deli Zhao

During testing, a new simplex is formed from the test sample and the points in the class.

Few-Shot Learning

Sparse Coding and Dictionary Learning With Linear Dynamical Systems

no code implementations • CVPR 2016 • Wenbing Huang, Fuchun Sun, Lele Cao, Deli Zhao, Huaping Liu, Mehrtash Harandi

To enhance the performance of LDSs, in this paper, we address the challenging issue of performing sparse coding on the space of LDSs, where both data and dictionary atoms are LDSs.

Dictionary Learning Video Classification

Network Representation Learning with Rich Text Information

3 code implementations • IJCAI 2015 • Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, Edward Chang

Representation learning has shown its effectiveness in many tasks such as image classification and text mining.

General Classification Image Classification +3

2,094

A New Retraction for Accelerating the Riemannian Three-Factor Low-Rank Matrix Completion Algorithm

no code implementations • CVPR 2015 • Zhizhong Li, Deli Zhao, Zhouchen Lin, Edward Y. Chang

In the line search step, R3MC approximates the minimum point on the searching curve by minimizing on the line tangent to the curve.

Low-Rank Matrix Completion

Zeta Hull Pursuits: Learning Nonconvex Data Hulls

no code implementations • NeurIPS 2014 • Yuanjun Xiong, Wei Liu, Deli Zhao, Xiaoou Tang

Selecting a small informative subset from a given dataset, also called column sampling, has drawn much attention in machine learning.

Image Classification

Errata: Distant Supervision for Relation Extraction with Matrix Completion

no code implementations • 17 Nov 2014 • Miao Fan, Deli Zhao, Qiang Zhou, Zhiyuan Liu, Thomas Fang Zheng, Edward Y. Chang

The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.

Classification General Classification +4

Homophilic Clustering by Locally Asymmetric Geometry

no code implementations • 5 Jul 2014 • Deli Zhao, Xiaoou Tang

Clustering is indispensable for data analysis in many scientific disciplines.

Clustering

Distant Supervision for Relation Extraction with Matrix Completion

1 code implementation • ACL 2014 • Miao Fan, Deli Zhao, Qiang Zhou, Zhiyuan Liu, Thomas Fang Zheng, Edward Y. Chang

Low-Rank Matrix Completion Multi-Label Classification +2

Graph Degree Linkage: Agglomerative Clustering on a Directed Graph

2 code implementations • 25 Aug 2012 • Wei Zhang, Xiaogang Wang, Deli Zhao, Xiaoou Tang

We explore the different roles of two fundamental concepts in graph theory, indegree and outdegree, in the context of clustering.

Ranked #1 on Image Clustering on Coil-20 (Accuracy metric)

Clustering Computational Efficiency +1

Cyclizing Clusters via Zeta Function of a Graph

no code implementations • NeurIPS 2008 • Deli Zhao, Xiaoou Tang

A mathematical tool, Zeta function of a graph, is introduced for the integration of all cycles, leading to a structural descriptor of the cluster in determinantal form.

Clustering
