no code implementations • 25 May 2024 • Jikun Kang, Xin Zhe Li, Xi Chen, Amirreza Kazemi, Boxing Chen
Inspired by findings that LLMs know how to produce the right answer but struggle to select the correct reasoning path, we propose a purely inference-based search method called MindStar (M*), which treats reasoning tasks as search problems.
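To make the "reasoning as search" framing concrete, here is a minimal sketch of best-first search over partial reasoning paths. The helpers `generate_steps` (an LLM sampler of candidate next steps) and `score_path` (a reward model over partial paths) are hypothetical stand-ins, not the paper's API:

```python
import heapq
import itertools

def best_first_reasoning_search(question, generate_steps, score_path,
                                beam_width=3, max_depth=8):
    """Best-first search over partial reasoning paths instead of greedy
    decoding. `generate_steps(question, path)` and `score_path(question,
    path)` are hypothetical stand-ins for an LLM sampler and a reward model.
    """
    tie = itertools.count()  # tiebreaker so heapq never compares paths
    frontier = [(-score_path(question, []), next(tie), [])]
    best_score, best_path = float("-inf"), []
    while frontier:
        neg_score, _, path = heapq.heappop(frontier)
        if path and -neg_score > best_score:
            best_score, best_path = -neg_score, path
        if len(path) >= max_depth:
            continue  # stop expanding paths at the depth limit
        # Expand the current path with a few sampled next reasoning steps.
        for step in generate_steps(question, path)[:beam_width]:
            new_path = path + [step]
            heapq.heappush(frontier, (-score_path(question, new_path),
                                      next(tie), new_path))
    return best_path
```

The key design point is that the reward model ranks whole partial paths, so a locally plausible but ultimately wrong step can be abandoned in favor of a better-scoring branch.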
no code implementations • 31 Dec 2023 • Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, Matthew E. Taylor
Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance.
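For context, a generic supervised fine-tuning loop looks roughly like the sketch below. This is a plain PyTorch illustration, not the method proposed in the paper; `model` is assumed to be a causal LM that maps token IDs to next-token logits of shape (batch, seq, vocab):

```python
import torch
from torch.optim import AdamW

def fine_tune(model, dataloader, epochs=3, lr=2e-5, device="cuda"):
    """Minimal supervised fine-tuning loop on a downstream task."""
    model.to(device).train()
    optimizer = AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for input_ids, labels in dataloader:
            input_ids, labels = input_ids.to(device), labels.to(device)
            logits = model(input_ids)
            # Standard causal-LM shift: predict token t+1 from tokens <= t.
            loss = loss_fn(logits[:, :-1].reshape(-1, logits.size(-1)),
                           labels[:, 1:].reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```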
1 code implementation • 24 May 2023 • Jikun Kang, Romain Laroche, Xingdi Yuan, Adam Trischler, Xue Liu, Jie Fu
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in its parameters throughout training.
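The forgetting effect can be made concrete with a small illustrative sketch: train sequentially on two tasks and measure how performance on the first degrades. The `train` and `evaluate` helpers are hypothetical, and this is not the paper's implementation:

```python
def measure_forgetting(model_cls, train, evaluate, task_a, task_b):
    """Illustrates forgetting: continued training on task B erodes what
    the model's parameters memorized for task A."""
    model = model_cls()
    train(model, task_a)
    acc_a_before = evaluate(model, task_a)
    train(model, task_b)                   # keep training the same model
    acc_a_after = evaluate(model, task_a)  # typically drops
    return acc_a_before - acc_a_after      # positive gap = forgetting
```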
no code implementations • 14 Mar 2023 • Jikun Kang, Di Wu, Ju Wang, Ekram Hossain, Xue Liu, Gregory Dudek
In cellular networks, User Equipment (UE) hands off from one Base Station (BS) to another, giving rise to the load-balancing problem among BSs.
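A toy greedy baseline makes the problem concrete: route a UE to the base station with the lowest projected utilization. This is an illustrative heuristic, not the learned policy from the paper:

```python
def pick_target_bs(ue_demand, bs_loads, bs_capacities, current_bs):
    """Greedy handoff heuristic for BS load balancing (toy baseline)."""
    def projected_util(bs):
        # The UE's demand is already counted in its current BS's load.
        extra = ue_demand if bs != current_bs else 0.0
        return (bs_loads[bs] + extra) / bs_capacities[bs]
    return min(bs_loads, key=projected_util)

# Example: the UE moves off the heavily loaded bs1 onto bs2.
assert pick_target_bs(0.2, {"bs1": 0.7, "bs2": 0.3},
                      {"bs1": 1.0, "bs2": 1.0}, "bs1") == "bs2"
```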
1 code implementation • 6 Oct 2021 • Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
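A common ACL pattern is to sample the next training task in proportion to recent learning progress, so the agent spends time where it improves fastest. The sketch below is a generic illustration of that idea, not the specific method proposed in the paper:

```python
import random

def sample_task(task_pool, learning_progress, temperature=1.0):
    """Minimal automatic-curriculum sketch: weight each task by the
    absolute change in recent return (its learning progress)."""
    weights = [abs(learning_progress[t]) ** (1.0 / temperature) + 1e-6
               for t in task_pool]
    return random.choices(task_pool, weights=weights, k=1)[0]
```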
no code implementations • 29 Sep 2021 • Jikun Kang, Xi Chen, Ju Wang, Chengming Hu, Xue Liu, Gregory Dudek
Results show that, compared with SOTA model-free methods, our method can improve the data efficiency and system performance by up to 75% and 10%, respectively.