1 code implementation • 6 Jun 2024 • Ye Tian, Ling Yang, Haotian Yang, Yuan Gao, Yufan Deng, Jingmin Chen, Xintao Wang, Zhaochen Yu, Xin Tao, Pengfei Wan, Di Zhang, Bin Cui
Diffusion models have demonstrated great success in text-to-video (T2V) generation.
no code implementations • 31 May 2024 • Jinchao Zhu, Yuxuan Wang, Siyuan Pan, Pengfei Wan, Di Zhang, Gao Huang
1) For the tuning method, we design a model assembly strategy to reconstruct a lightweight model while preserving performance through distillation.
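As a rough illustration of the distillation step mentioned above, the snippet below sketches a standard knowledge-distillation objective in PyTorch; the temperature, weighting, and the `teacher`/`student` models are assumptions for illustration, and the model assembly strategy itself is not reproduced here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between softened teacher and student distributions.
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)

# Typical usage when training the reassembled lightweight model:
# loss = task_loss + alpha * distillation_loss(student(x), teacher(x).detach())
```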
no code implementations • 24 May 2024 • Chenxi Sun, Hongzhi Zhang, Zijia Lin, Jingyuan Zhang, Fuzheng Zhang, Zhongyuan Wang, Bin Chen, Chengru Song, Di Zhang, Kun Gai, Deyi Xiong
The core of our approach is the observation that a pre-trained language model can confidently predict multiple contiguous tokens, forming the basis of a lexical unit, within which these contiguous tokens can be decoded in parallel.
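A toy sketch of the acceptance rule this suggests, assuming the per-position logits for a short window of future tokens are already available (e.g., from a draft or speculative pass); this is not the paper's implementation, only an illustration of grouping confidently predicted contiguous tokens into a unit that is accepted in one step.

```python
import torch

def accept_lexical_unit(logits, threshold=0.9, max_unit_len=8):
    # logits: (window_len, vocab_size) next-token logits for a window of
    # future positions. Accept the longest prefix of tokens whose predicted
    # probability exceeds `threshold`; these form one lexical unit that can
    # be emitted in parallel. Decoding falls back to one token at a time
    # as soon as confidence drops.
    probs = torch.softmax(logits, dim=-1)
    confidences, tokens = probs.max(dim=-1)
    unit = []
    for conf, tok in zip(confidences.tolist(), tokens.tolist()):
        if conf < threshold or len(unit) >= max_unit_len:
            break
        unit.append(tok)
    return unit
```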
no code implementations • 23 May 2024 • Sixian Zhang, Bohan Wang, Junqiang Wu, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang
Current evaluation of text-to-image models typically relies on statistical metrics that inadequately represent real human preferences.
1 code implementation • 23 May 2024 • Zhicheng Sun, Zhenhao Yang, Yang Jin, Haozhe Chi, Kun Xu, Liwei Chen, Hao Jiang, Di Zhang, Yang song, Kun Gai, Yadong Mu
Our study shows that, based on a recent rectified flow framework, the major limitation of vanilla classifier guidance, namely the requirement of a special classifier, can be resolved with a simple fixed-point solution, allowing flexible personalization with off-the-shelf image discriminators.
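A loose sketch of how such a guided sampling step could be organized, under our own assumptions: `velocity_model` is a rectified-flow velocity network, `discriminator` is any off-the-shelf scorer of clean images, and the fixed-point loop is a generic formulation rather than the authors' algorithm.

```python
import torch

def guided_rectified_flow_step(x_t, t, dt, velocity_model, discriminator,
                               guidance_scale=1.0, n_fixed_point_iters=3):
    # One Euler step of a rectified flow ODE with guidance from a discriminator
    # that scores clean images. The guided state is found by a small fixed-point
    # iteration: predict the clean sample x1 = x + (1 - t) * v(x, t), score it,
    # and update the current state with the resulting gradient.
    x = x_t
    for _ in range(n_fixed_point_iters):
        with torch.enable_grad():
            x = x.detach().requires_grad_(True)
            v = velocity_model(x, t)
            x1_hat = x + (1.0 - t) * v                  # predicted clean sample
            score = discriminator(x1_hat).sum()         # off-the-shelf preference score
            grad = torch.autograd.grad(score, x)[0]
        x = x_t + guidance_scale * grad                 # fixed-point update of the state
    v = velocity_model(x.detach(), t)
    return x_t + dt * v                                 # advance the ODE from the guided state
```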
1 code implementation • 29 Apr 2024 • Meng Li, Haoran Jin, Ruixuan Huang, Zhihao Xu, Defu Lian, Zijia Lin, Di Zhang, Xiting Wang
Based on this, we quantify faithfulness via the difference in the output upon perturbation.
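A minimal sketch of a perturbation-based faithfulness measure of this kind, assuming a simple tabular model and a set of feature indices marked important by an explanation; the names and the masking baseline are illustrative, not the paper's exact metric.

```python
import numpy as np

def faithfulness_score(model, x, important_idx, baseline=0.0):
    # Mask the features an explanation marks as important and measure how much
    # the model output changes; a larger change suggests a more faithful
    # explanation. `model` maps a 1-D feature vector to a scalar.
    x_perturbed = x.copy()
    x_perturbed[important_idx] = baseline
    return abs(model(x) - model(x_perturbed))

# Example with a toy linear model:
model = lambda v: float(np.dot(v, [0.5, -2.0, 1.0]))
print(faithfulness_score(model, np.array([1.0, 2.0, 3.0]), important_idx=[1]))  # 4.0
```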
no code implementations • 17 Apr 2024 • Jiao Ou, Jiayu Wu, Che Liu, Fuzheng Zhang, Di Zhang, Kun Gai
Existing methods take instructions from real instruction dialogues as the learning target and fine-tune a user simulator to pose instructions.
no code implementations • 15 Apr 2024 • Zhaokun Zhou, Qiulin Wang, Bin Lin, Yiwei Su, Rui Chen, Xin Tao, Amin Zheng, Li Yuan, Pengfei Wan, Di Zhang
To further evaluate the IAA capability of MLLMs, we construct the UNIAA-Bench, which consists of three aesthetic levels: Perception, Description, and Assessment.
2 code implementations • 9 Apr 2024 • Xiuqi Deng, Lu Xu, Xiyao Li, Jinkai Yu, Erpeng Xue, Zhongyuan Wang, Di Zhang, Zhaojie Liu, Guorui Zhou, Yang song, Na Mou, Shen Jiang, Han Li
In this paper, we propose an industrial multimodal recommendation framework named EM3: End-to-end training of Multimodal Model and ranking Model, which fully exploits multimodal information and allows personalized ranking tasks to directly train the core modules of the multimodal model, yielding more task-oriented content features without excessive resource consumption.
no code implementations • 29 Mar 2024 • Luozhou Wang, Guibao Shen, Yixun Liang, Xin Tao, Pengfei Wan, Di Zhang, Yijun Li, Yingcong Chen
In this research, we present a novel approach to motion customization in video generation, addressing the largely unexplored problem of motion representation within video generative models.
no code implementations • 21 Mar 2024 • Yiquan Chen, Yingchao Lyu, Di Zhang
Deep reinforcement learning has made significant progress in games with imperfect information, but its performance in the card game Doudizhu (Chinese Poker/Fight the Landlord) remains unsatisfactory.
2 code implementations • 12 Mar 2024 • Weijia Wu, Zhuang Li, YuChao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang
We introduce DragAnything, which utilizes an entity representation to achieve motion control for any object in controllable video generation.
no code implementations • 6 Mar 2024 • Di Zhang, Moyang Wang, Joseph Mango, Xiang Li, Xianrui Xu
Given these advancements, there has been a surge in novel methods employing reinforcement learning to tackle spatial resource allocation problems.
1 code implementation • 26 Feb 2024 • Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but a comprehensive approach for detecting safety issues within LLMs' responses in an aligned, customizable, and explainable manner is still lacking.
no code implementations • 16 Feb 2024 • Yihong Tang, Jiao Ou, Che Liu, Fuzheng Zhang, Di Zhang, Kun Gai
Experiments on models improved by RoleAD indicate that our adversarial dataset ameliorates this deficiency, with the improvements demonstrating a degree of generalizability in ordinary scenarios.
no code implementations • 10 Feb 2024 • Di Zhang, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan, Jiatong Li, Weiran Huang, Xiangyu Yue, Wanli Ouyang, Dongzhan Zhou, Shufei Zhang, Mao Su, Han-sen Zhong, Yuqiang Li
However, the community lacks an LLM specifically designed for chemistry.
1 code implementation • 5 Feb 2024 • Yang Jin, Zhicheng Sun, Kun Xu, Liwei Chen, Hao Jiang, Quzhe Huang, Chengru Song, Yuliang Liu, Di Zhang, Yang song, Kun Gai, Yadong Mu
In light of recent advances in multimodal Large Language Models (LLMs), there is increasing attention to scaling them from image-text data to more informative real-world videos.
Ranked #2 on Visual Question Answering on MMBench
no code implementations • 5 Feb 2024 • Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao
In practice, users often desire the ability to control object motion and camera movement independently for customized video creation.
no code implementations • 22 Jan 2024 • Lihua Jian, Songlei Xiong, Han Yan, Xiaoguang Niu, Shaowu Wu, Di Zhang
The DIIM is designed by modifying the vanilla cross-attention mechanism, which promotes the extraction of discrepancy information from the source images.
1 code implementation • 11 Jan 2024 • Zhipeng Chen, Kun Zhou, Wayne Xin Zhao, Junchen Wan, Fuzheng Zhang, Di Zhang, Ji-Rong Wen
To address this, we propose a new RL method named RLMEC that incorporates a generative model as the reward model, which is trained on the erroneous-solution rewriting task under a minimum-editing constraint and can produce token-level rewards for RL training.
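As a toy analogy for the token-level credit assignment under a minimum-editing view (the paper's reward model is a trained generative model, not this heuristic), the sketch below uses Python's difflib to align a whitespace-tokenized solution with its corrected rewrite: tokens preserved by the minimal edit get reward 1, changed tokens get 0.

```python
from difflib import SequenceMatcher

def token_level_rewards(generated_tokens, corrected_tokens):
    # Tokens preserved by the minimal edit between the generated solution and
    # its corrected rewrite get reward 1; tokens that had to be changed get 0.
    rewards = [0.0] * len(generated_tokens)
    matcher = SequenceMatcher(a=generated_tokens, b=corrected_tokens)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            rewards[i] = 1.0
    return rewards

print(token_level_rewards("3 + 4 = 8".split(), "3 + 4 = 7".split()))
# [1.0, 1.0, 1.0, 1.0, 0.0]
```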
no code implementations • 27 Dec 2023 • Xun Guo, Mingwu Zheng, Liang Hou, Yuan Gao, Yufan Deng, Pengfei Wan, Di Zhang, Yufan Liu, Weiming Hu, ZhengJun Zha, Haibin Huang, Chongyang Ma
I2V-Adapter adeptly propagates the unnoised input image to subsequent noised frames through a cross-frame attention mechanism, maintaining the identity of the input image without any changes to the pretrained T2V model.
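A generic sketch of the underlying mechanism (not the I2V-Adapter architecture itself), assuming per-frame token features and placeholder linear projections: every frame's queries attend to keys and values computed from the first, unnoised frame, so its content is propagated to the noised frames.

```python
import torch
import torch.nn as nn

def first_frame_cross_attention(frame_feats, to_q, to_k, to_v):
    # frame_feats: (T, N, C) per-frame token features. Queries come from every
    # frame, while keys/values come from the first frame only, propagating the
    # input image's identity to all subsequent frames.
    q = to_q(frame_feats)                                               # (T, N, C)
    k = to_k(frame_feats[:1]).expand(frame_feats.size(0), -1, -1)
    v = to_v(frame_feats[:1]).expand(frame_feats.size(0), -1, -1)
    attn = torch.softmax(q @ k.transpose(-1, -2) / q.size(-1) ** 0.5, dim=-1)
    return attn @ v

# e.g. with a projection borrowed from a pretrained self-attention block:
# C = 64; proj = nn.Linear(C, C)
# out = first_frame_cross_attention(torch.randn(8, 16, C), proj, proj, proj)
```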
1 code implementation • 24 Nov 2023 • Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang
In this paper, we introduce an information-enriched diffusion model for the paragraph-to-image generation task, termed ParaDiffusion, which delves into the transfer of the extensive semantic comprehension capabilities of large language models to the task of image generation.
no code implementations • 14 Nov 2023 • Lei Lin, Jiayi Fu, Pengli Liu, Qingyang Li, Yan Gong, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai
Although chain-of-thought (CoT) prompting combined with language models has achieved encouraging results on complex reasoning tasks, the naive greedy decoding used in CoT prompting often leads to repetitiveness and local optimality.
1 code implementation • 3 Nov 2023 • Jiao Ou, Junda Lu, Che Liu, Yihong Tang, Fuzheng Zhang, Di Zhang, Kun Gai
In this paper, we propose DialogBench, a dialogue evaluation benchmark that contains 12 dialogue tasks to probe the capabilities that LLMs should have as human-like dialogue systems.
no code implementations • 17 Oct 2023 • Huan Yuan, Chao Liao, Jianchao Tan, Peng Yao, Jiyuan Jia, Bin Chen, Chengru Song, Di Zhang
To alleviate the disadvantages of both categories of methods, we propose to unify static and dynamic compression techniques to obtain an input-adaptive compressed model that better balances the total compression ratio and model performance.
no code implementations • 17 Oct 2023 • Peng Yao, Chao Liao, Jiyuan Jia, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang
Deep neural networks have achieved great success thanks to increasing amounts of data and diverse, effective network designs.
no code implementations • 11 Oct 2023 • Yuchong Sun, Che Liu, Kun Zhou, Jinwen Huang, Ruihua Song, Wayne Xin Zhao, Fuzheng Zhang, Di Zhang, Kun Gai
In this paper, we introduce Parrot, a solution aiming to enhance multi-turn instruction following for LLMs.
no code implementations • 11 Oct 2023 • Jiayi Fu, Lei Lin, Xiaoyang Gao, Pengli Liu, Zhengzong Chen, Zhirui Yang, ShengNan Zhang, Xue Zheng, Yan Li, Yuliang Liu, Xucheng Ye, Yiqiao Liao, Chao Liao, Bin Chen, Chengru Song, Junchen Wan, Zijia Lin, Fuzheng Zhang, Zhongyuan Wang, Di Zhang, Kun Gai
Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even on mathematical tasks requiring multi-step reasoning.
Ranked #88 on Arithmetic Reasoning on GSM8K (using extra training data)
1 code implementation • 9 Sep 2023 • Yang Jin, Kun Xu, Liwei Chen, Chao Liao, Jianchao Tan, Quzhe Huang, Bin Chen, Chenyi Lei, An Liu, Chengru Song, Xiaoqiang Lei, Di Zhang, Wenwu Ou, Kun Gai, Yadong Mu
Specifically, we introduce a well-designed visual tokenizer to translate the non-linguistic image into a sequence of discrete tokens, like a foreign language that an LLM can read.
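A minimal sketch of the discretization idea behind such a tokenizer, assuming generic patch embeddings and a learned codebook (the paper's tokenizer is more elaborate): each patch embedding is replaced by the id of its nearest codebook entry, yielding a discrete token sequence.

```python
import torch

def quantize_to_tokens(patch_embeddings, codebook):
    # patch_embeddings: (num_patches, dim); codebook: (vocab_size, dim).
    # Map each continuous patch embedding to the id of its nearest codebook
    # entry, turning the image into a token sequence an LLM can consume.
    distances = torch.cdist(patch_embeddings, codebook)   # (num_patches, vocab_size)
    return distances.argmin(dim=-1)                        # nearest code id per patch

# tokens = quantize_to_tokens(torch.randn(256, 32), torch.randn(8192, 32))
```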
1 code implementation • 9 Aug 2023 • Jue Chen, Huan Yuan, Jianchao Tan, Bin Chen, Chengru Song, Di Zhang
We propose an improved end-to-end Minimax optimization method for this sparse learning problem to better balance model performance and computational efficiency.
no code implementations • 24 Jun 2023 • Xiao Zhang, Hai Zhang, Hongtu Zhou, Chang Huang, Di Zhang, Chen Ye, Junqiao Zhao
In this paper, we propose a method to construct a boundary that discriminates safe and unsafe states.
no code implementations • 22 Feb 2023 • Haoran Yin, Jiaojiao Xiong, Yu Zhou, Chi Zhang, Di Zhang, Xizhang Wei, Yanqun Tang
Delay-Doppler waveform design has been considered a promising solution for reliable communication over high-mobility channels in space-air-ground integrated networks (SAGIN).
no code implementations • 19 Jan 2023 • Chris Egersdoerfer, Dong Dai, Di Zhang
With the increasing prevalence of scalable file systems in High Performance Computing (HPC), accurate anomaly detection on runtime logs is becoming increasingly important.
no code implementations • 20 Oct 2022 • Di Zhang, Youzhou Zhou
2) It satisfies the connectivity constraint, that is, all currencies are guaranteed to be tradable.
no code implementations • 4 Jul 2022 • Di Zhang, Qiang Niu, Youzhou Zhou
2) If variational inference (VI) is used for state estimation, it runs much faster than Monte Carlo (MC) methods, since the calculation of the posterior uses only basic arithmetic operations.
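To illustrate why an analytic update can be "basic arithmetic only", the sketch below shows a conjugate Gaussian-mean posterior; the paper's state-space model is different, and this toy update is only an assumption-laden stand-in for the contrast with sampling-based MC estimation.

```python
def gaussian_mean_posterior(prior_mean, prior_var, obs, obs_var):
    # Closed-form (conjugate) posterior of a Gaussian mean given observations
    # with known noise variance: only sums, products, and divisions are needed,
    # whereas a Monte Carlo estimate would require drawing and weighting many
    # samples for the same quantity.
    n = len(obs)
    precision = 1.0 / prior_var + n / obs_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + sum(obs) / obs_var)
    return post_mean, post_var

print(gaussian_mean_posterior(0.0, 1.0, [0.9, 1.1, 1.0], 0.25))  # (~0.92, ~0.077)
```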
1 code implementation • 11 Apr 2022 • Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, Bo Zheng
However, we observe that GPU devices are underutilized when training recommender systems and cannot attain the throughput improvements achieved in CV and NLP areas.
no code implementations • 28 Mar 2022 • Zhirong Xu, Shiyang Wen, Junshan Wang, Guojun Liu, Liang Wang, Zhi Yang, Lei Ding, Yan Zhang, Di Zhang, Jian Xu, Bo Zheng
Moreover, to deploy AMCAD in Taobao, one of the largest e-commerce platforms with hundreds of millions of users, we design an efficient two-layer online retrieval framework for graph-based advertisement retrieval.
no code implementations • 8 Sep 2021 • Di Zhang
The original lottery ticket hypothesis performs pruning and weight resetting after training convergence, exposing it to the problems of forgetting learned knowledge and potentially high training cost.
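For reference, a minimal sketch of one round of the original lottery-ticket procedure that the sentence above refers to (for a single weight matrix, with an illustrative pruning fraction): prune the smallest-magnitude weights after convergence and rewind the survivors to their initial values before retraining.

```python
import numpy as np

def lottery_ticket_round(init_weights, trained_weights, prune_fraction=0.2):
    # Prune the smallest-magnitude trained weights and reset the surviving
    # weights to their values at initialization.
    threshold = np.quantile(np.abs(trained_weights), prune_fraction)
    mask = (np.abs(trained_weights) > threshold).astype(init_weights.dtype)
    return init_weights * mask, mask   # rewound weights and the pruning mask
```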
no code implementations • 31 May 2021 • An Yang, Junyang Lin, Rui Men, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Jiamang Wang, Yong Li, Di Zhang, Wei Lin, Lin Qu, Jingren Zhou, Hongxia Yang
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost, and they have thus become a trend in model scaling.
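A generic sketch of why MoE compute stays roughly constant as parameters grow (not the system described in the paper): with top-1 gating, each token is processed by a single expert, so adding experts adds parameters without adding per-token work.

```python
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    # Minimal top-1 mixture-of-experts layer: parameters grow with the number
    # of experts, but each token is routed to exactly one expert.
    def __init__(self, dim, num_experts):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))

    def forward(self, x):                      # x: (num_tokens, dim)
        scores = self.gate(x)                  # (num_tokens, num_experts)
        expert_idx = scores.argmax(dim=-1)     # route each token to its top expert
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            sel = expert_idx == e
            if sel.any():
                out[sel] = expert(x[sel])      # only selected tokens pay this expert's cost
        return out

# layer = Top1MoE(dim=64, num_experts=16); y = layer(torch.randn(128, 64))
```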
no code implementations • 30 Mar 2021 • Feng Li, Zhenrui Chen, Pengjie Wang, Yi Ren, Di Zhang, Xiaoyu Zhu
Moreover, it is difficult for users to move beyond their specific historical behaviors to explore new interests, namely the weak generalization problem.
1 code implementation • 27 Feb 2021 • Xuewan Zhang, Dalong Zhang, Liuqing Yang, Gangtao Han, Hsiao-Hwa Chen, Di Zhang
Thus, the proposed codebook design approach outperforms existing codebook design schemes in BER performance, in both uncoded and coded SCMA systems, especially for large-size codebooks.
no code implementations • 24 Feb 2021 • Yibo Wang, Siqi Jiang, Jingkuan Xiao, Xiaofan Cai, Di Zhang, Ping Wang, Guodong Ma, Yaqing Han, Jiabei Huang, Kenji Watanabe, Takashi Taniguchi, Alexander S. Mayorov, Geliang Yu
Van der Waals (vdW) assembly of two-dimensional materials has long been recognized as a powerful tool to create unique systems with properties that cannot be found in natural compounds.
Mesoscale and Nanoscale Physics • Materials Science
no code implementations • 10 Feb 2021 • Haijing Zhou, Junjie Cao, Jingwei Lian, Di Zhang
Approximate analytical formulas describing the dark matter abundance and cross section in the scattering with nucleons are used to illustrate a dependence on theoretical parameters in neutralino and Higgs sectors.
High Energy Physics - Phenomenology
no code implementations • 9 Feb 2021 • Di Zhang, Shun Zhou
For the first time, the Wilson coefficients of all the relevant six-dimensional operators are computed by carrying out the one-loop matching between the effective theory and the full seesaw model, and are applied to calculate the total rates of radiative decays of charged leptons.
High Energy Physics - Phenomenology • High Energy Physics - Experiment
1 code implementation • 20 Oct 2019 • Di Zhang, Dong Dai, Youbiao He, Forrest Sheng Bao, Bing Xie
Today, high-performance computing (HPC) platforms are still dominated by batch jobs.
no code implementations • 25 Sep 2019 • Yu He, Shiyang Wen, Wenjin Wu, Yan Zhang, Siran Yang, Yuan Wei, Di Zhang, Guojie Song, Wei Lin, Liang Wang, Bo Zheng
The Graph Convolutional Network (GCN) and its variants are powerful models for graph representation learning and have recently achieved great success on many graph-based applications.
no code implementations • 3 Jan 2019 • Michael Wojnowicz, Di Zhang, Glenn Chisholm, Xuan Zhao, Matt Wolff
However, the recent development of randomized principal component analysis (RPCA) has opened up the possibility of obtaining approximate principal components on very large datasets.
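A brief sketch of the randomized-PCA idea referenced here (a Halko-style range finder followed by an exact SVD in the reduced space); the oversampling amount and centering are illustrative defaults, not the paper's exact configuration.

```python
import numpy as np

def randomized_pca(X, n_components, n_oversamples=10, seed=0):
    # Project the centered data onto a small random subspace, orthonormalize,
    # then compute an exact SVD of the much smaller projected matrix to obtain
    # approximate principal components cheaply.
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                              # center the data
    k = n_components + n_oversamples
    omega = rng.standard_normal((Xc.shape[1], k))        # random test matrix
    Q, _ = np.linalg.qr(Xc @ omega)                      # orthonormal basis of the range
    B = Q.T @ Xc                                         # small projected matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:n_components], s[:n_components]           # components and singular values

# comps, svals = randomized_pca(np.random.randn(10000, 300), n_components=5)
```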
no code implementations • 11 Jun 2018 • Hao Dong, Shuai Li, Dongchang Xu, Yi Ren, Di Zhang
The training of deep neural networks usually requires tremendous computing resources.