no code implementations • 25 May 2024 • Jikun Kang, Xin Zhe Li, Xi Chen, Amirreza Kazemi, Boxing Chen
Inspired by findings that LLMs know how to produce the right answer but struggle to select the correct reasoning path, we propose a purely inference-based search method called MindStar (M*), which treats reasoning tasks as search problems.
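To make the "reasoning as search" framing concrete, here is a minimal sketch of best-first search over partial reasoning paths. The helpers `generate_steps` (an LLM sampler of candidate next steps) and `score_path` (a reward model over partial paths) are hypothetical stand-ins, not the paper's API:

```python
import heapq
import itertools

def best_first_reasoning_search(question, generate_steps, score_path,
                                beam_width=3, max_depth=8):
    """Best-first search over partial reasoning paths instead of greedy
    decoding. `generate_steps(question, path)` and `score_path(question,
    path)` are hypothetical stand-ins for an LLM sampler and a reward model.
    """
    tie = itertools.count()  # tiebreaker so heapq never compares paths
    frontier = [(-score_path(question, []), next(tie), [])]
    best_score, best_path = float("-inf"), []
    while frontier:
        neg_score, _, path = heapq.heappop(frontier)
        if path and -neg_score > best_score:
            best_score, best_path = -neg_score, path
        if len(path) >= max_depth:
            continue  # stop expanding paths at the depth limit
        # Expand the current path with a few sampled next reasoning steps.
        for step in generate_steps(question, path)[:beam_width]:
            new_path = path + [step]
            heapq.heappush(frontier, (-score_path(question, new_path),
                                      next(tie), new_path))
    return best_path
```

The key design point is that the reward model ranks whole partial paths, so a locally plausible but ultimately wrong step can be abandoned in favor of a better-scoring branch.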
no code implementations • 31 Dec 2023 • Qianxi Li, Yingyue Cao, Jikun Kang, Tianpei Yang, Xi Chen, Jun Jin, Matthew E. Taylor
Fine-tuning Large Language Models (LLMs) adapts a trained model to specific downstream tasks, significantly improving task-specific performance.
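For context, a generic supervised fine-tuning loop looks roughly like the sketch below. This is a plain PyTorch illustration, not the method proposed in the paper; `model` is assumed to be a causal LM that maps token IDs to next-token logits of shape (batch, seq, vocab):

```python
import torch
from torch.optim import AdamW

def fine_tune(model, dataloader, epochs=3, lr=2e-5, device="cuda"):
    """Minimal supervised fine-tuning loop on a downstream task."""
    model.to(device).train()
    optimizer = AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for input_ids, labels in dataloader:
            input_ids, labels = input_ids.to(device), labels.to(device)
            logits = model(input_ids)
            # Standard causal-LM shift: predict token t+1 from tokens <= t.
            loss = loss_fn(logits[:, :-1].reshape(-1, logits.size(-1)),
                           labels[:, 1:].reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```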
1 code implementation • 24 May 2023 • Jikun Kang, Romain Laroche, Xingdi Yuan, Adam Trischler, Xue Liu, Jie Fu
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in its parameters throughout training.
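The forgetting effect can be made concrete with a small illustrative sketch: train sequentially on two tasks and measure how performance on the first degrades. The `train` and `evaluate` helpers are hypothetical, and this is not the paper's implementation:

```python
def measure_forgetting(model_cls, train, evaluate, task_a, task_b):
    """Illustrates forgetting: continued training on task B erodes what
    the model's parameters memorized for task A."""
    model = model_cls()
    train(model, task_a)
    acc_a_before = evaluate(model, task_a)
    train(model, task_b)                   # keep training the same model
    acc_a_after = evaluate(model, task_a)  # typically drops
    return acc_a_before - acc_a_after      # positive gap = forgetting
```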
no code implementations • 14 Mar 2023 • Jikun Kang, Di Wu, Ju Wang, Ekram Hossain, Xue Liu, Gregory Dudek
In cellular networks, User Equipment (UE) hands off from one Base Station (BS) to another, giving rise to the load-balancing problem among BSs.
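A toy greedy baseline makes the problem concrete: route a UE to the base station with the lowest projected utilization. This is an illustrative heuristic, not the learned policy from the paper:

```python
def pick_target_bs(ue_demand, bs_loads, bs_capacities, current_bs):
    """Greedy handoff heuristic for BS load balancing (toy baseline)."""
    def projected_util(bs):
        # The UE's demand is already counted in its current BS's load.
        extra = ue_demand if bs != current_bs else 0.0
        return (bs_loads[bs] + extra) / bs_capacities[bs]
    return min(bs_loads, key=projected_util)

# Example: the UE moves off the heavily loaded bs1 onto bs2.
assert pick_target_bs(0.2, {"bs1": 0.7, "bs2": 0.3},
                      {"bs1": 1.0, "bs2": 1.0}, "bs1") == "bs2"
```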
1 code implementation • 6 Oct 2021 • Jikun Kang, Miao Liu, Abhinav Gupta, Chris Pal, Xue Liu, Jie Fu
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
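A common ACL pattern is to sample the next training task in proportion to recent learning progress, so the agent spends time where it improves fastest. The sketch below is a generic illustration of that idea, not the specific method proposed in the paper:

```python
import random

def sample_task(task_pool, learning_progress, temperature=1.0):
    """Minimal automatic-curriculum sketch: weight each task by the
    absolute change in recent return (its learning progress)."""
    weights = [abs(learning_progress[t]) ** (1.0 / temperature) + 1e-6
               for t in task_pool]
    return random.choices(task_pool, weights=weights, k=1)[0]
```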
no code implementations • 29 Sep 2021 • Jikun Kang, Xi Chen, Ju Wang, Chengming Hu, Xue Liu, Gregory Dudek
Results show that, compared with SOTA model-free methods, our method can improve the data efficiency and system performance by up to 75% and 10%, respectively.