no code implementations • 7 Jun 2024 • Yinghui Xia, Yuyan Chen, Tianyu Shi, Jun Wang, Jinsong Yang
Therefore, we construct AICoderEval, a dataset focused on real-world tasks in various domains based on HuggingFace, PyTorch, and TensorFlow, along with comprehensive metrics for evaluating and enhancing LLMs' task-specific code generation capabilities.
1 code implementation • 7 May 2024 • Ziqi Zhou, Jingyue Zhang, Jingyuan Zhang, Boyue Wang, Tianyu Shi, Alaa Khamis
One of the key challenges in current Reinforcement Learning (RL)-based Automated Driving (AD) agents is achieving flexible, precise, and human-like behavior cost-effectively.
no code implementations • 15 Apr 2024 • Ruoxi Cheng, Haoxuan Ma, Shuirong Cao, Tianyu Shi
Biases and stereotypes in Large Language Models (LLMs) can have negative implications for user experience and societal outcomes.
1 code implementation • 2 Apr 2024 • Xuechen Liang, Meiling Tao, Tianyu Shi, Yiting Xie
Open large language models (LLMs) have significantly advanced the field of natural language processing, showcasing impressive performance across various tasks. Despite these advances, their effective operation still relies heavily on human input to accurately guide the dialogue flow, with agent tuning being a crucial optimization technique in which humans adjust the model so that it responds better to such guidance. Addressing this dependency, our work introduces the TinyAgent model, trained on a meticulously curated high-quality dataset.
1 code implementation • 2 Apr 2024 • Chen Yang, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian
To enhance scene diversity and stochasticity, the historical trajectory data are first preprocessed and encoded into a latent space using Denoising Diffusion Probabilistic Models (DDPM) enhanced with Diffusion Transformer (DiT) blocks.
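As a small illustration of the encoding step described above (a sketch, not the paper's implementation: the number of diffusion steps, the linear noise schedule, and the trajectory shapes are all assumptions), the DDPM forward process that noises a clean trajectory toward the latent prior can be written as:

```python
import numpy as np

def make_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule, a common DDPM default (assumed here)."""
    return np.linspace(beta_start, beta_end, T)

def q_sample(x0, t, alphas_cumprod, rng):
    """Forward diffusion q(x_t | x_0): noise a clean trajectory to step t."""
    noise = rng.standard_normal(x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, noise

betas = make_beta_schedule()
alphas_cumprod = np.cumprod(1.0 - betas)
rng = np.random.default_rng(0)
traj = rng.standard_normal((8, 50, 2))  # 8 trajectories, 50 steps, (x, y)
x_t, eps = q_sample(traj, t=500, alphas_cumprod=alphas_cumprod, rng=rng)
```

A DiT-style denoiser would then be trained to predict `eps` from `x_t` and `t`; the learned reverse process supplies the latent encoding.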
1 code implementation • 17 Dec 2023 • Meiling Tao, Xuechen Liang, Tianyu Shi, Lei Yu, Yiting Xie
This study presents RoleCraft-GLM, an innovative framework aimed at enhancing personalized role-playing with Large Language Models (LLMs).
no code implementations • 27 Nov 2023 • Jianxiong Li, Shichao Lin, Tianyu Shi, Chujie Tian, Yu Mei, Jian Song, Xianyuan Zhan, Ruimin Li
Specifically, we combine well-established traffic flow theory with machine learning to construct a reward inference model to infer the reward signals from coarse-grained traffic data.
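A toy sketch of combining traffic flow theory with coarse-grained data (the triangular fundamental diagram, its parameters, and the throughput-as-reward proxy are all illustrative assumptions, not details from the paper):

```python
def triangular_fd(k, vf=15.0, w=5.0, kj=0.15):
    """Triangular fundamental diagram: flow q(k) from density k (veh/m).
    vf: free-flow speed (m/s), w: backward wave speed, kj: jam density."""
    return min(vf * k, w * (kj - k))

def reward_from_densities(link_densities):
    """Toy inferred reward: total throughput implied by the fundamental
    diagram; higher flow under the same demand indicates less delay."""
    return sum(triangular_fd(k) for k in link_densities)

r = reward_from_densities([0.02, 0.05, 0.12])
```

The paper's reward inference model is learned rather than fixed; this only shows how a flow-theoretic quantity can stand in for a dense reward signal.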
no code implementations • 2 Jun 2023 • Tianyu Shi, Francois-Xavier Devailly, Denis Larocque, Laurent Charlin
Building on the previous state-of-the-art model, which uses a decentralized approach for large-scale traffic signal control with graph convolutional networks (GCNs), we first learn models using a distributional reinforcement learning (DisRL) approach.
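A standard building block of a DisRL agent is the categorical (C51-style) projection of the Bellman-updated value distribution back onto a fixed support. A minimal numpy sketch, where the support range, atom count, and discount are assumed values rather than the paper's settings:

```python
import numpy as np

def project_distribution(p_next, rewards, gamma, support):
    """C51 projection: apply the Bellman operator T z = r + gamma * z to the
    atoms, then redistribute probability mass onto the fixed support."""
    v_min, v_max = support[0], support[-1]
    n_atoms = len(support)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    tz = np.clip(rewards[:, None] + gamma * support[None, :], v_min, v_max)
    b = (tz - v_min) / delta_z                 # fractional atom positions
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    proj = np.zeros_like(p_next)
    for i in range(p_next.shape[0]):
        for j in range(n_atoms):
            # Split each atom's mass between its two neighbouring bins
            proj[i, lower[i, j]] += p_next[i, j] * (upper[i, j] - b[i, j])
            proj[i, upper[i, j]] += p_next[i, j] * (b[i, j] - lower[i, j])
            if lower[i, j] == upper[i, j]:     # landed exactly on an atom
                proj[i, lower[i, j]] += p_next[i, j]
    return proj

support = np.linspace(-10.0, 10.0, 51)
p_next = np.full((4, 51), 1.0 / 51)            # uniform next-state distributions
proj = project_distribution(p_next, rewards=np.zeros(4), gamma=0.99, support=support)
```

Each projected row remains a valid probability distribution, which is what lets the agent minimize a cross-entropy loss against it.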
Distributional Reinforcement Learning • Multi-agent Reinforcement Learning • +2
no code implementations • 16 Dec 2022 • Tianyu Shi, Zhicheng Wang, Liyin Xiao, Cong Liu
Most recent studies on neural constituency parsing focus on encoder structures, while few developments are devoted to decoders.
no code implementations • 3 Nov 2022 • Zhicheng Wang, Tianyu Shi, Cong Liu
In constituency parsing, span-based decoding is an important direction.
no code implementations • 1 Nov 2022 • Zhicheng Wang, Tianyu Shi, Liyin Xiao, Cong Liu
We propose a novel algorithm that improves on the previous neural span-based CKY decoder for constituency parsing.
1 code implementation • 14 Oct 2022 • Xi Chen, Tianyu Shi, Qingpeng Zhao, Yuchen Sun, Yunfei Gao, Xiangjun Wang
It provides realistic 3D environments of variable complexity, various tasks, and multiple modes of interaction, where agents can learn to perceive 3D environments, navigate and plan, compete and cooperate in a human-like manner.
no code implementations • 25 Jun 2022 • Zhiyuan Yao, Tianyu Shi, Site Li, Yiting Xie, Yuanyuan Qin, Xiongjie Xie, Huan Lu, Yan Zhang
Axie Infinity is a complicated card game with a vast action space.
no code implementations • 3 Mar 2022 • Tianyu Shi, Yifei Ai, Omar ElSamadisy, Baher Abdulhai
We propose a Deep Reinforcement Learning (DRL) framework for car-following control that integrates bilateral information into both the state and the reward function, based on the bilateral control model (BCM).
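For context, the bilateral control law balances the gaps and relative speeds to both the leading and following vehicles. A minimal sketch (the gains `kd` and `kv` are illustrative, not the paper's values):

```python
def bcm_acceleration(gap_front, gap_back, v_front, v_ego, v_back,
                     kd=0.5, kv=0.5):
    """Bilateral control model: accelerate toward equal front/back gaps
    and matched relative speeds with both neighbours."""
    gap_term = kd * (gap_front - gap_back)
    speed_term = kv * ((v_front - v_ego) - (v_ego - v_back))
    return gap_term + speed_term

# Symmetric situation -> no acceleration; larger front gap -> speed up.
a_symmetric = bcm_acceleration(20.0, 20.0, 10.0, 10.0, 10.0)
a_open_gap = bcm_acceleration(30.0, 20.0, 10.0, 10.0, 10.0)
```

In the DRL framework, quantities like these bilateral gap and speed terms inform the state representation and reward rather than being used as a fixed controller.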
no code implementations • 6 Jul 2020 • Tianyu Shi, Jiawei Wang, Yuankai Wu, Luis Miranda-Moreno, Lijun Sun
Instead of learning a reliable behavior for the ego automated vehicle alone, we focus on improving the outcomes of the total transportation system by allowing the automated vehicles to learn to cooperate with one another and regulate human-driven traffic flow.
no code implementations • 23 Apr 2019 • Tianyu Shi, Pin Wang, Xuxin Cheng, Ching-Yao Chan, Ding Huang
We apply a Deep Q-Network (DQN) that accounts for safety during the task to decide whether to conduct the maneuver.
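One common way to combine a DQN's value estimates with a safety check is to mask unsafe actions before taking the greedy choice; this sketch is an illustrative stand-in for the paper's safety consideration, and the Q-values and action set are invented:

```python
import numpy as np

def safe_greedy_action(q_values, safe_mask):
    """Pick the highest-Q action among those the safety check allows."""
    masked = np.where(safe_mask, q_values, -np.inf)
    return int(np.argmax(masked))

q = np.array([0.8, 1.2])   # hypothetical Q(s, keep_lane), Q(s, change_lane)
# The lane change has higher value, but the safety check forbids it:
action = safe_greedy_action(q, safe_mask=np.array([True, False]))
```

With both actions deemed safe, the agent would instead pick the higher-valued maneuver.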
1 code implementation • 18 Apr 2019 • Chenyang Xi, Tianyu Shi, Yuankai Wu, Lijun Sun
Traditional motion planning methods suffer from several drawbacks in terms of optimality, efficiency and generalization capability.
no code implementations • 31 Jan 2019 • Tianyu Shi, Pin Wang, Ching-Yao Chan, Chonghao Zou
A reliable controller is critical and essential for the execution of safe and smooth maneuvers of an autonomous vehicle. The controller must be robust to external disturbances, such as road surface, weather, and wind conditions. It also needs to handle internal parametric variations of vehicle sub-systems, including power-train efficiency, measurement errors, and time delays. Moreover, as in most production vehicles, the low-level control commands for the engine, brake, and steering systems are delivered through separate electronic control units. These factors introduce opacity and ineffectiveness into controller performance. In this paper, we design a feed-forward compensation process via a data-driven method to model and further optimize controller performance. We apply principal component analysis to extract the most influential features. Subsequently, we adopt a time-delay neural network to predict the tracking error over a future time horizon. Utilizing the predicted error, we then design a feed-forward compensation process to improve control performance. Finally, we demonstrate the effectiveness of the proposed feed-forward compensation process in simulation scenarios.
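The PCA-then-predict-then-compensate pipeline can be sketched in a few lines. This is a simplified stand-in: a least-squares regressor replaces the time-delay neural network, and all data, dimensions, and the lag horizon are assumed for illustration:

```python
import numpy as np

def pca_features(X, k):
    """Project raw features onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def fit_delay_predictor(feats, error, lag=5):
    """Least-squares stand-in for the time-delay network: predict the
    tracking error `lag` steps ahead from the current PCA features."""
    A, y = feats[:-lag], error[lag:]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))        # synthetic vehicle/controller features
err = 0.3 * X[:, 0] + 0.01 * rng.standard_normal(200)  # synthetic tracking error
feats = pca_features(X, k=3)
w = fit_delay_predictor(feats, err, lag=5)
# Feed-forward compensation: adjust a nominal command (placeholder 1.0)
# by the error predicted for the upcoming horizon.
compensated = 1.0 + feats[-1] @ w
```

The key design choice is that compensation is applied before the error materializes, using the prediction over the future horizon rather than a purely reactive feedback term.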