no code implementations • 17 Apr 2024 • Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle
Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to fine-tuning.
no code implementations • 11 Dec 2023 • Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel
To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times.
no code implementations • 16 Oct 2023 • Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu
With these methods, we present a thorough empirical study on a series of PaLM 2 models and find: (1) The quality and style of the step-by-step solutions used for fine-tuning can make a significant impact on the model performance; (2) While solution re-ranking and majority voting are both effective for improving the model performance when used separately, they can also be used together for an even greater performance boost; (3) Multi-task fine-tuning that sequentially separates the solution generation and evaluation tasks can offer improved performance compared with the solution fine-tuning baseline.
no code implementations • 6 Sep 2023 • David B. D'Ambrosio, Jonathan Abelian, Saminda Abeyruwan, Michael Ahn, Alex Bewley, Justin Boyd, Krzysztof Choromanski, Omar Cortes, Erwin Coumans, Tianli Ding, Wenbo Gao, Laura Graesser, Atil Iscen, Navdeep Jaitly, Deepali Jain, Juhana Kangaspunta, Satoshi Kataoka, Gus Kouretas, Yuheng Kuang, Nevena Lazic, Corey Lynch, Reza Mahjourian, Sherry Q. Moore, Thinh Nguyen, Ken Oslund, Barney J Reed, Krista Reymann, Pannag R. Sanketi, Anish Shankar, Pierre Sermanet, Vikas Sindhwani, Avi Singh, Vincent Vanhoucke, Grace Vesom, Peng Xu
We present a deep-dive into a real-world robotic learning system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and has the ability to precisely return the ball to desired targets.
no code implementations • 5 Oct 2022 • David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley
We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection.
no code implementations • 11 Jul 2022 • Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
no code implementations • ICLR 2021 • Avi Singh, Huihan Liu, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine
Reinforcement learning provides a general framework for flexible decision making and control, but requires extensive data collection for each new task that an agent needs to learn.
1 code implementation • 27 Oct 2020 • Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine
Reinforcement learning has been applied to a wide variety of robotics problems, but most of such applications involve collecting data from scratch for each new task.
no code implementations • ICLR 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
The success of reinforcement learning in the real world has been limited to instrumented laboratory scenarios, often requiring arduous human supervision to enable continuous learning.
no code implementations • 27 Apr 2020 • Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.
no code implementations • 25 Feb 2020 • Avi Singh, Eric Jang, Alexander Irpan, Daniel Kappler, Murtaza Dalal, Sergey Levine, Mohi Khansari, Chelsea Finn
In this work, we target this challenge, aiming to build an imitation learning system that can continuously improve through autonomous data collection, while simultaneously avoiding the explicit use of reinforcement learning, to maintain the stability, simplicity, and scalability of supervised imitation.
3 code implementations • 16 Apr 2019 • Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.
no code implementations • 30 Sep 2018 • Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn
To that end, we formulate the few-shot objective learning problem, where the goal is to learn a task objective from only a few example images of successful end states for that task.
no code implementations • NeurIPS 2018 • Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine
We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.
1 code implementation • ICLR 2018 • Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.
no code implementations • ICCV 2017 • Avi Singh, Larry Yang, Sergey Levine
We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose.
11 code implementations • CVPR 2017 • Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra
We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content.
Ranked #15 on Visual Dialog on VisDial v0.9 val
no code implementations • 5 Jan 2016 • Ashesh Jain, Hema S. Koppula, Shane Soh, Bharad Raghavan, Avi Singh, Ashutosh Saxena
We introduce a diverse data set with 1180 miles of natural freeway and city driving, and show that we can anticipate maneuvers 3. 5 seconds before they occur in real-time with a precision and recall of 90. 5\% and 87. 4\% respectively.
no code implementations • 16 Sep 2015 • Ashesh Jain, Avi Singh, Hema S. Koppula, Shane Soh, Ashutosh Saxena
We introduce a sensory-fusion architecture which jointly learns to anticipate and fuse information from multiple sensory streams.