no code implementations • 11 May 2024 • Wang Lin, Jingyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao Jin, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang
We tackle the common challenge of inter-concept visual confusion in compositional concept generation using text-guided diffusion models (TGDMs).
1 code implementation • 10 Mar 2024 • Minjie Zhu, Yichen Zhu, Xin Liu, Ning Liu, Zhiyuan Xu, Chaomin Shen, Yaxin Peng, Zhicai Ou, Feifei Feng, Jian Tang
Multimodal Large Language Models (MLLMs) have showcased impressive skills in tasks related to visual understanding and reasoning.
Ranked #88 on Visual Question Answering on MM-Vet
1 code implementation • 1 Feb 2024 • Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao
In this paper, we systematically survey current efforts on the evaluation, attack, and defense of MLLMs' safety on images and text.
no code implementations • 31 Jan 2024 • Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang
Instead of a simple combination of pruning and SD, EPSD enables the pruned network to favor SD by keeping more distillable weights before training to ensure better distillation of the pruned network.
1 code implementation • 18 Jan 2024 • Ruizhe Zhang, Xinke Jiang, Yuchen Fang, Jiayuan Luo, Yongxin Xu, Yichen Zhu, Xu Chu, Junfeng Zhao, Yasha Wang
In recent years, Graph Neural Networks (GNNs) have shown considerable effectiveness in a variety of graph learning tasks, particularly those based on the message-passing approach.
no code implementations • 8 Jan 2024 • Minjie Zhu, Yichen Zhu, Jinming Li, Junjie Wen, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang
Language-conditioned robotic manipulation aims to transfer natural language instructions into executable actions, from simple pick-and-place to tasks requiring intent recognition and visual reasoning.
no code implementations • 5 Jan 2024 • Junjie Wen, Yichen Zhu, Minjie Zhu, Jinming Li, Zhiyuan Xu, Zhengping Che, Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang
Humans interpret scenes by recognizing both the identities and positions of objects in their observations.
1 code implementation • 4 Jan 2024 • Yichen Zhu, Minjie Zhu, Ning Liu, Zhicai Ou, Xiaofeng Mou, Jian Tang
In this paper, we introduce LLaVA-$\phi$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model, Phi-2, to facilitate multi-modal dialogues.
Ranked #103 on Visual Question Answering on MM-Vet
no code implementations • 18 Dec 2023 • Wanying Wang, Yichen Zhu, Yirui Zhou, Chaomin Shen, Jian Tang, Zhiyuan Xu, Yaxin Peng, Yangchun Zhang
Generative Adversarial Imitation Learning (GAIL) stands as a cornerstone approach in imitation learning.
1 code implementation • 29 Nov 2023 • Xin Liu, Yichen Zhu, Jindong Gu, Yunshi Lan, Chao Yang, Yu Qiao
The security concerns surrounding Large Language Models (LLMs) have been extensively explored, yet the safety of Multimodal Large Language Models (MLLMs) remains understudied.
no code implementations • 15 Oct 2023 • Zijian Zhang, Luping Liu, Zhijie Lin, Yichen Zhu, Zhou Zhao
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
no code implementations • 25 Jul 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao
3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.
no code implementations • 24 Jul 2023 • Jiaben Chen, Yichen Zhu, Dongze Lian, Jiaqi Yang, Yifu Wang, Renrui Zhang, Xinhang Liu, Shenhan Qian, Laurent Kneip, Shenghua Gao
We therefore propose to incorporate RGB information in an event-guided optical flow refinement strategy.
1 code implementation • ICCV 2023 • Zehan Wang, Haifeng Huang, Yang Zhao, Linjun Li, Xize Cheng, Yichen Zhu, Aoxiong Yin, Zhou Zhao
To accomplish this, we design a novel semantic matching model that analyzes the semantic similarity between object proposals and sentences in a coarse-to-fine manner.
no code implementations • 5 Jul 2023 • Qiqi Zhou, Yichen Zhu
The TLA enables ReViT to process images with the minimum sufficient number of tokens, reducing token numbers in the ViT model and improving inference speed.
no code implementations • 18 May 2023 • Yichen Zhu, Jian Yuan, Bo Jiang, Tao Lin, Haiming Jin, Xinbing Wang, Chenghu Zhou
We focus on the case where the underlying joint distribution of complete features and label is invariant, but the missing pattern, i.e., the mask distribution, may shift agnostically between training and testing.
no code implementations • CVPR 2023 • Yichen Zhu, Qiqi Zhou, Ning Liu, Zhiyuan Xu, Zhicai Ou, Xiaofeng Mou, Jian Tang
Unlike existing works that struggle to balance the trade-off between inference speed and SOD performance, in this paper, we propose a novel Scale-aware Knowledge Distillation (ScaleKD), which transfers knowledge of a complex teacher model to a compact student model.
1 code implementation • 24 Jul 2022 • Yaomin Huang, Xinmei Liu, Yichen Zhu, Zhiyuan Xu, Chaomin Shen, Zhengping Che, Guixu Zhang, Yaxin Peng, Feifei Feng, Jian Tang
Detecting 3D objects from point clouds is a practical yet challenging task that has attracted increasing attention recently.
no code implementations • 16 Feb 2022 • Matt Davison, Marcos Escobar-Anel, Yichen Zhu
This paper investigates the optimal choices of financial derivatives to complete a financial market in the framework of stochastic volatility (SV) models.
no code implementations • 11 Jan 2022 • Marcos Escobar-Anel, Matt Davison, Yichen Zhu
This paper challenges the use of stocks in portfolio construction; instead, we demonstrate that Asian derivatives, straddles, or baskets could be more convenient substitutes.
no code implementations • 6 Dec 2021 • Yichen Zhu, Weibin Meng, Ying Liu, Shenglin Zhang, Tao Han, Shimin Tao, Dan Pei
UniLog: Deploy One Model and Specialize it for All Log Analysis Tasks
no code implementations • 3 Dec 2021 • Yichen Zhu, Yuqin Zhu, Jie Du, Yi Wang, Zhicai Ou, Feifei Feng, Jian Tang
The TLA enables the ReViT to process the image with the minimum sufficient number of tokens during inference.
no code implementations • 1 Dec 2021 • Yichen Zhu, Jie Du, Yuqin Zhu, Yi Wang, Zhicai Ou, Feifei Feng, Jian Tang
Critically, there has been no effort to understand (1) why training only BatchNorm can find well-performing architectures with reduced supernet training time, and (2) how the train-BN-only supernet differs from the standard-train supernet.
no code implementations • 5 Oct 2021 • Yichen Zhu, Bo Jiang, Haiming Jin, Mengtian Zhang, Feng Gao, Jianqiang Huang, Tao Lin, Xinbing Wang
An important task in such applications is to predict the future values of a NETS based on its historical values and the underlying graph.
no code implementations • ICCV 2021 • Yichen Zhu, Yi Wang
We formulate knowledge distillation as a multi-task learning problem so that the teacher transfers knowledge to the student only if the student can benefit from learning such knowledge.
no code implementations • 26 Apr 2020 • Yichen Zhu, Cheng Li, David B. Dunson
When data are limited in one or more of the classes, the estimated decision boundaries are often irregularly shaped due to the limited sample size, leading to poor generalization error.
no code implementations • 25 Sep 2019 • Yichen Zhu, Xiangyu Zhang, Tong Yang, Jian Sun
We introduce the adaptive resizable networks as dynamic networks, which further improve the performance with less computational cost via data-dependent inference.
no code implementations • 25 Sep 2019 • Shizheng Qin, Yichen Zhu, Pengfei Hou, Xiangyu Zhang, Wenqiang Zhang, Jian Sun
In this paper, we propose a learnable sampling module based on a variational auto-encoder (VAE) for neural architecture search (NAS), named VAENAS, which can be easily embedded into existing weight-sharing NAS frameworks, e.g., one-shot and gradient-based approaches, and significantly improves the quality of the search results.
1 code implementation • 13 May 2019 • Huichu Zhang, Siyuan Feng, Chang Liu, Yaoyao Ding, Yichen Zhu, Zihan Zhou, Wei-Nan Zhang, Yong Yu, Haiming Jin, Zhenhui Li
The most commonly used open-source traffic simulator, SUMO, is, however, not scalable to large road networks and large traffic flows, which hinders the study of reinforcement learning on traffic scenarios.
no code implementations • ICML 2018 • Pengtao Xie, Hongbao Zhang, Yichen Zhu, Eric Xing
Variable selection is a classic problem in machine learning (ML), widely used to find important explanatory factors and to improve the generalization performance and interpretability of ML models.
no code implementations • ICML 2018 • Pengtao Xie, Wei Wu, Yichen Zhu, Eric P. Xing
In this paper, we address these three issues by (1) seeking convex relaxations of the original nonconvex problems so that the global optimum is guaranteed to be achievable; (2) providing a formal analysis of OPR's capability of promoting balancedness; (3) providing a theoretical analysis that directly reveals the relationship between OPR and generalization performance.