1 code implementation • 30 May 2024 • Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Deli Zhao, Lidong Bing
To provide an automatic, robust, and trustworthy evaluation framework, we innovatively propose the Auto-Arena of LLMs, which automates the entire evaluation process with LLM agents.
no code implementations • 6 Feb 2024 • Rui Jiao, Wenbing Huang, Yu Liu, Deli Zhao, Yang Liu
Crystals are the foundation of numerous scientific and industrial applications.
no code implementations • 17 Dec 2023 • Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek
Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training.
3 code implementations • 7 Nov 2023 • Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou
By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos.
no code implementations • 16 Oct 2023 • Xiang Wang, Shiwei Zhang, Hangjie Yuan, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang
In this paper, we develop an effective plug-and-play framework called CapFSAR to exploit the knowledge of multimodal models without manually annotating text.
no code implementations • 14 Oct 2023 • Mengfei Xia, Yujun Shen, Changsong Lei, Yu Zhou, Ran Yi, Deli Zhao, Wenping Wang, Yong-Jin Liu
By viewing the generation of diffusion models as a discretized integrating process, we argue that the quality drop is partly caused by applying an inaccurate integral direction to a timestep interval.
no code implementations • ICCV 2023 • Shiyue Cao, Yueqin Yin, Lianghua Huang, Yu Liu, Xin Zhao, Deli Zhao, Kaiqi Huang
Vector-quantized image modeling has shown great potential in synthesizing high-quality images.
no code implementations • 25 Sep 2023 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen, Bolei Zhou
This work fills in this gap by proposing in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model.
1 code implementation • ICCV 2023 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yingya Zhang, Changxin Gao, Deli Zhao, Nong Sang
When pre-training on the large-scale Kinetics-710, we achieve 89.7% on Kinetics-400 with a frozen ViT-L model, which verifies the scalability of DiST.
3 code implementations • ICCV 2023 • Hangjie Yuan, Shiwei Zhang, Xiang Wang, Samuel Albanie, Yining Pan, Tao Feng, Jianwen Jiang, Dong Ni, Yingya Zhang, Deli Zhao
In this paper, we propose RLIPv2, a fast converging model that enables the scaling of relational pre-training to large-scale pseudo-labelled scene graph data.
Ranked #1 on Zero-Shot Human-Object Interaction Detection on HICO-DET (using extra training data)
no code implementations • ICCV 2023 • Kecheng Zheng, Wei Wu, Ruili Feng, Kai Zhu, Jiawei Liu, Deli Zhao, Zheng-Jun Zha, Wei Chen, Yujun Shen
To bring the useful knowledge back into light, we first identify a set of parameters that are important to a given downstream task, then attach a binary mask to each parameter, and finally optimize these masks on the downstream data with the parameters frozen.
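The mask-optimization step described above can be illustrated with a toy sketch. It is not the authors' implementation: the linear model, the real-valued mask scores, the straight-through gradient, and the names `masked_predict` and `train_masks` are all assumptions made for illustration. The only part taken from the excerpt is that the weights stay frozen while one binary mask per weight is learned on downstream data.

```python
def masked_predict(weights, scores, x):
    """Linear prediction that uses only the weights whose mask is on (score > 0)."""
    return sum(w * (1.0 if s > 0 else 0.0) * xi
               for w, s, xi in zip(weights, scores, x))


def train_masks(weights, data, steps=50, lr=0.1):
    """Learn one binary mask per frozen weight by squared-error descent.

    The weights are never modified. Only the real-valued scores behind
    the masks are updated (hypothetical parameterization).
    """
    scores = [0.5] * len(weights)  # every mask starts switched on
    for _ in range(steps):
        for x, y in data:
            err = masked_predict(weights, scores, x) - y
            # Straight-through estimator: pretend d(mask)/d(score) = 1.
            for i, (w, xi) in enumerate(zip(weights, x)):
                scores[i] -= lr * 2.0 * err * w * xi
    return [1 if s > 0 else 0 for s in scores]


# Toy downstream task in which the second frozen weight is harmful.
frozen = [1.0, 2.0, -3.0]
downstream = [([1, 0, 0], 1.0), ([0, 1, 0], 0.0), ([0, 0, 1], -3.0)]
print(train_masks(frozen, downstream))
```

In this toy task the mask for the second weight is switched off after the first update, and the other two masks stay on because their predictions already match the targets.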
2 code implementations • 18 Jul 2023 • Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao
This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations in a harmonious way.
1 code implementation • NeurIPS 2023 • Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang
Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language description.
no code implementations • 20 Jun 2023 • Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng
Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.
4 code implementations • NeurIPS 2023 • Xiang Wang, Hangjie Yuan, Shiwei Zhang, Dayou Chen, Jiuniu Wang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou
The pursuit of controllability as a higher standard of visual content creation has yielded remarkable progress in customizable image synthesis.
Ranked #5 on Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset (using extra training data)
1 code implementation • 30 May 2023 • Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
Synthesizing images with user-specified subjects has received growing attention due to its practical applications.
1 code implementation • CVPR 2023 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang
To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method that contains two crucial components, including a long-short contrastive objective and a motion autodecoder.
1 code implementation • CVPR 2023 • Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan
A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distribution.
Ranked #7 on Video Generation on UCF-101
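As a rough illustration of the forward diffusion process mentioned in the excerpt above, the sketch below samples a noised point using the standard DDPM closed form, x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The linear beta schedule, its endpoints, and the function names are assumptions for illustration and are not taken from this paper.

```python
import math
import random


def alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        bars.append(prod)
    return bars


def forward_diffuse(x0, t, bars, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    signal = math.sqrt(bars[t])
    noise = math.sqrt(1.0 - bars[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


bars = alpha_bars(1000)
rng = random.Random(0)
early = forward_diffuse([1.0, -2.0], 10, bars, rng)   # still close to the data
late = forward_diffuse([1.0, -2.0], 999, bars, rng)   # close to pure noise
```

Because `bars` decreases monotonically toward zero, later timesteps keep less of the original signal, which is the "gradually adding noise" behavior the excerpt describes.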
1 code implementation • ICCV 2023 • Yulin Pan, Xiangteng He, Biao Gong, Yiliang Lv, Yujun Shen, Yuxin Peng, Deli Zhao
Video temporal grounding aims to pinpoint a video segment that matches the query description.
no code implementations • ICCV 2023 • Yutong Feng, Biao Gong, Jianwen Jiang, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou
ViM consists of a zoo of lightweight plug-in modules, each of which is independently learned on a midstream dataset with a shared frozen backbone.
1 code implementation • 9 Mar 2023 • Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
Concatenating multiple clusters of concept neurons can vividly generate all related concepts in a single image.
1 code implementation • 6 Mar 2023 • Xiang Wang, Shiwei Zhang, Jun Cen, Changxin Gao, Yingya Zhang, Deli Zhao, Nong Sang
Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task.
no code implementations • 1 Mar 2023 • Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Yiliang Lv, Deli Zhao, Jingren Zhou
The U-Tuning framework can simultaneously encompass existing methods and derive new approaches for parameter-efficient transfer learning, which prove to achieve on-par or better performances on CIFAR-100 and FGVC datasets when compared with existing PETL methods.
6 code implementations • 20 Feb 2023 • Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, Jingren Zhou
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
no code implementations • 14 Feb 2023 • Biao Gong, Xiaoying Xie, Yutong Feng, Yiliang Lv, Yujun Shen, Deli Zhao
This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data.
1 code implementation • 8 Feb 2023 • Jun Cen, Di Luan, Shiwei Zhang, Yixuan Pei, Yingya Zhang, Deli Zhao, Shaojie Shen, Qifeng Chen
Recently, Unified Open-set Recognition (UOSR) has been proposed to reject not only unknown samples but also known but wrongly classified samples, which tends to be more practical in real-world applications.
no code implementations • ICCV 2023 • Jiapeng Zhu, Ceyuan Yang, Yujun Shen, Zifan Shi, Bo Dai, Deli Zhao, Qifeng Chen
This work presents an easy-to-use regularizer for GAN training, which helps explicitly link some axes of the latent space to a set of pixels in the synthesized image.
no code implementations • CVPR 2023 • Chen-Wei Xie, Siyang Sun, Xiong Xiong, Yun Zheng, Deli Zhao, Jingren Zhou
This process can be considered as an open-book exam: with the reference set as a cheat sheet, the proposed method doesn't need to memorize all visual concepts in the training data.
1 code implementation • ICCV 2023 • Pandeng Li, Chen-Wei Xie, Liming Zhao, Hongtao Xie, Jiannan Ge, Yun Zheng, Deli Zhao, Yongdong Zhang
In the event-sentence prototype matching phase, we design a temporal prototype generation mechanism to associate intra-frame objects and interact inter-frame temporal relations.
no code implementations • CVPR 2023 • Jiayu Wang, Kang Zhao, Shiwei Zhang, Yingya Zhang, Yujun Shen, Deli Zhao, Jingren Zhou
Generating a talking face video from the input audio sequence is a practical yet challenging task.
no code implementations • ICCV 2023 • Kai Zhu, Kecheng Zheng, Ruili Feng, Deli Zhao, Yang Cao, Zheng-Jun Zha
Non-exemplar class-incremental learning aims to recognize both the old and new classes without access to old class samples.
no code implementations • ICCV 2023 • Yixuan Pei, Zhiwu Qing, Shiwei Zhang, Xiang Wang, Yingya Zhang, Deli Zhao, Xueming Qian
In this paper, we will fill this gap by learning multiple prompts based on a powerful image-language pre-trained model, i.e., CLIP, making it fit for video class-incremental learning (VCIL).
no code implementations • CVPR 2023 • Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng
Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.
no code implementations • CVPR 2023 • Ruili Feng, Kecheng Zheng, Kai Zhu, Yujun Shen, Jian Zhao, Yukun Huang, Deli Zhao, Jingren Zhou, Michael Jordan, Zheng-Jun Zha
By investigating the properties of the problem solution, we confirm that neural dependency is guaranteed by a redundant logit covariance matrix, a condition that is easily met given massive categories, and that neural dependency is highly sparse, implying that one category correlates to only a few others.
no code implementations • 30 Sep 2022 • Zifan Shi, Yinghao Xu, Yujun Shen, Deli Zhao, Qifeng Chen, Dit-yan Yeung
We argue that, considering the two-player game in the formulation of GANs, only making the generator 3D-aware is not enough.
no code implementations • 20 Sep 2022 • Ceyuan Yang, Yujun Shen, Yinghao Xu, Deli Zhao, Bo Dai, Bolei Zhou
Two capacity adjusting schemes are developed for training GANs under different data regimes: i) given a sufficient amount of training data, the discriminator benefits from a progressively increased learning capacity, and ii) when the training data is limited, gradually decreasing the layer width mitigates the over-fitting issue of the discriminator.
no code implementations • 13 Jun 2022 • Ruili Feng, Kecheng Zheng, Yukun Huang, Deli Zhao, Michael Jordan, Zheng-Jun Zha
By virtue of our numerical tools, we provide the first empirical analysis of the per-layer behavior of network rank in practical settings, i.e., ResNets, deep MLPs, and Transformers on ImageNet.
no code implementations • 21 May 2022 • Ruili Feng, Jie Xiao, Kecheng Zheng, Deli Zhao, Jingren Zhou, Qibin Sun, Zheng-Jun Zha
Humans can extrapolate well, generalize daily knowledge to unseen scenarios, and raise and answer counterfactual questions.
1 code implementation • 19 Feb 2022 • Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen
Despite the rapid advancement of semantic discovery in the latent space of Generative Adversarial Networks (GANs), existing approaches either are limited to finding global attributes or rely on a number of segmentation masks to identify local attributes.
1 code implementation • NeurIPS 2021 • Jiapeng Zhu, Ruili Feng, Yujun Shen, Deli Zhao, ZhengJun Zha, Jingren Zhou, Qifeng Chen
Concretely, given an arbitrary image and a region of interest (e.g., eyes of face images), we manage to relate the latent space to the image region with the Jacobian matrix and then use low-rank factorization to discover steerable latent subspaces.
2 code implementations • 10 Jun 2020 • Ruili Feng, Deli Zhao, ZhengJun Zha
Noise injection has proven to be one of the key technical advances in generating high-fidelity images.
2 code implementations • ECCV 2020 • Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
no code implementations • 8 Mar 2020 • Lone Wong, Deli Zhao, Shaohua Wan, Bo Zhang
Progressive growing enhances image resolution gradually, thereby preserving the precision of the recovered image.
no code implementations • 21 Dec 2019 • Deli Zhao, Jiapeng Zhu, Bo Zhang
Variational Auto-Encoder (VAE) has been widely applied as a fundamental generative model in machine learning.
no code implementations • 25 Sep 2019 • Jiapeng Zhu, Deli Zhao, Bolei Zhou, Bo Zhang
A two-stage stochasticity-free training scheme is designed to train LIA via adversarial learning, in the sense that the decoder of LIA is first trained as a standard GAN with the invertible network and then the partial encoder is learned from an autoencoder by detaching the invertible network from LIA.
no code implementations • 25 Sep 2019 • Deli Zhao, Jiapeng Zhu, Bo Zhang
Variational inference is a fundamental problem in Variational AutoEncoder (VAE).
no code implementations • 27 Jun 2019 • Deli Zhao, Jiapeng Zhu, Zhenfang Guo, Bo Zhang
The experiments on cat and human-face data validate that our algorithm is able to learn the optimal generative models (e.g., ProGAN) with respect to specified quality metrics for noisy data.
3 code implementations • 19 Jun 2019 • Jiapeng Zhu, Deli Zhao, Bo Zhang, Bolei Zhou
In this paper, we show that the entanglement of the latent space for the VAE/GAN framework poses the main challenge for encoder learning.
no code implementations • NeurIPS 2018 • Runsheng Yu, Wenyu Liu, Yasen Zhang, Zhi Qu, Deli Zhao, Bo Zhang
Based on these sub-images, a local exposure for each sub-image is learned automatically and sequentially by a policy network, while the learning reward is designed globally to strike a balance among the overall exposures.
no code implementations • 27 Jul 2018 • Bowen Zhang, Xifan Zhang, Fan Cheng, Deli Zhao
During testing, a new simplex is formed from the test sample and the points in the class.
no code implementations • CVPR 2016 • Wenbing Huang, Fuchun Sun, Lele Cao, Deli Zhao, Huaping Liu, Mehrtash Harandi
To enhance the performance of LDSs, in this paper, we address the challenging issue of performing sparse coding on the space of LDSs, where both data and dictionary atoms are LDSs.
3 code implementations • IJCAI 2015 • Cheng Yang, Zhiyuan Liu, Deli Zhao, Maosong Sun, Edward Chang
Representation learning has shown its effectiveness in many tasks such as image classification and text mining.
no code implementations • CVPR 2015 • Zhizhong Li, Deli Zhao, Zhouchen Lin, Edward Y. Chang
In the line search step, R3MC approximates the minimum point on the searching curve by minimizing on the line tangent to the curve.
no code implementations • NeurIPS 2014 • Yuanjun Xiong, Wei Liu, Deli Zhao, Xiaoou Tang
Selecting a small informative subset from a given dataset, also called column sampling, has drawn much attention in machine learning.
no code implementations • 17 Nov 2014 • Miao Fan, Deli Zhao, Qiang Zhou, Zhiyuan Liu, Thomas Fang Zheng, Edward Y. Chang
The essence of distantly supervised relation extraction is that it is an incomplete multi-label classification problem with sparse and noisy features.
no code implementations • 5 Jul 2014 • Deli Zhao, Xiaoou Tang
Clustering is indispensable for data analysis in many scientific disciplines.
2 code implementations • 25 Aug 2012 • Wei Zhang, Xiaogang Wang, Deli Zhao, Xiaoou Tang
We explore the different roles of two fundamental concepts in graph theory, indegree and outdegree, in the context of clustering.
Ranked #1 on Image Clustering on Coil-20 (Accuracy metric)
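For readers unfamiliar with the two graph-theoretic quantities named in the excerpt above, here is a minimal sketch of computing indegree and outdegree on a directed graph. The edge-list representation and the function name `degrees` are assumptions for illustration. How the paper uses these quantities for clustering is not shown here.

```python
from collections import Counter


def degrees(edges):
    """Return (indegree, outdegree) counters for directed edges given as (u, v) pairs."""
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1  # edge leaves u
        indeg[v] += 1   # edge enters v
    return indeg, outdeg


# Small directed graph: a -> b, c -> b, b -> a.
indeg, outdeg = degrees([("a", "b"), ("c", "b"), ("b", "a")])
print(indeg["b"], outdeg["b"])
```

In this example node `b` receives two edges and emits one, so its indegree and outdegree differ, which is the kind of asymmetry a directed neighborhood graph can exhibit.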
no code implementations • NeurIPS 2008 • Deli Zhao, Xiaoou Tang
A mathematical tool, the zeta function of a graph, is introduced for the integration of all cycles, leading to a structural descriptor of the cluster in determinantal form.