no code implementations • ICML 2020 • Michael Chang, Sid Kaushik, S. Matthew Weinberg, Sergey Levine, Thomas Griffiths
This paper seeks to establish a mechanism for directing a collection of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems with a central global objective.
1 code implementation • 6 Jun 2024 • Yutaka Shimizu, Joey Hong, Sergey Levine, Masayoshi Tomizuka
In this paper, we propose a novel framework called Strategically Conservative Q-Learning (SCQ) that distinguishes between OOD data that is easy and hard to estimate, ultimately resulting in less conservative value estimates.
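The core idea of selective pessimism can be illustrated with a minimal sketch. Everything here is hypothetical: the function name, the scalar uncertainty signal, and the fixed threshold are illustrative stand-ins, not SCQ's actual criterion for separating easy and hard OOD data.

```python
def strategically_conservative_q(q_estimate, uncertainty, threshold=0.5, penalty_scale=1.0):
    """Selective pessimism sketch: penalize a Q-value estimate only when an
    uncertainty signal marks the (state, action) pair as hard to estimate.
    Easy-to-estimate pairs keep their raw estimate, avoiding the blanket
    conservatism of penalizing all out-of-distribution actions equally."""
    if uncertainty > threshold:
        return q_estimate - penalty_scale * uncertainty
    return q_estimate

# An easy-to-estimate action keeps its value; a hard one is pushed down.
easy = strategically_conservative_q(1.0, uncertainty=0.1)
hard = strategically_conservative_q(1.0, uncertainty=0.8)
```

The design point is that the amount of conservatism is conditioned on estimability rather than applied uniformly.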
no code implementations • 30 May 2024 • Masatoshi Uehara, Yulai Zhao, Ehsan Hajiramezanali, Gabriele Scalia, Gökcen Eraslan, Avantika Lal, Sergey Levine, Tommaso Biancalani
To combine the strengths of both approaches, we adopt a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
no code implementations • 20 May 2024 • Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, You Liang Tan, Lawrence Yunliang Chen, Pannag Sanketi, Quan Vuong, Ted Xiao, Dorsa Sadigh, Chelsea Finn, Sergey Levine
In experiments across 9 robotic platforms, we demonstrate that Octo serves as a versatile policy initialization that can be effectively finetuned to new observation and action spaces.
no code implementations • 16 May 2024 • Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Yifei Zhou, Alane Suhr, Saining Xie, Yann Lecun, Yi Ma, Sergey Levine
Finally, our framework uses these task rewards to fine-tune the entire VLM with RL.
1 code implementation • 9 May 2024 • Xuanlin Li, Kyle Hsu, Jiayuan Gu, Karl Pertsch, Oier Mees, Homer Rich Walke, Chuyuan Fu, Ishikaa Lunawat, Isabel Sieh, Sean Kirmani, Sergey Levine, Jiajun Wu, Chelsea Finn, Hao Su, Quan Vuong, Ted Xiao
We then employ these approaches to create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
no code implementations • 7 May 2024 • Kyle Stachowicz, Sergey Levine
The high-speed off-road driving task represents a particularly challenging instantiation of this problem: a high-return policy should drive as aggressively and as quickly as possible, which often requires getting close to the edge of the set of "safe" states, and therefore places a particular burden on the method to avoid frequent failures.
1 code implementation • 25 Apr 2024 • Toru Lin, Yu Zhang, Qiyang Li, Haozhi Qi, Brent Yi, Sergey Levine, Jitendra Malik
Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing.
1 code implementation • 9 Apr 2024 • Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr
We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.
no code implementations • 19 Mar 2024 • Lucy Xiaoyang Shi, Zheyuan Hu, Tony Z. Zhao, Archit Sharma, Karl Pertsch, Jianlan Luo, Sergey Levine, Chelsea Finn
In this paper, we make the following observation: high-level policies that index into sufficiently rich and expressive low-level language-conditioned skills can be readily supervised with human feedback in the form of language corrections.
1 code implementation • 8 Mar 2024 • Katie Kang, Eric Wallace, Claire Tomlin, Aviral Kumar, Sergey Levine
Leveraging our previous observations on controlling hallucinations, we propose an approach for learning more reliable reward models, and show that they improve the efficacy of RL factuality finetuning in long-form biography and book/movie plot generation tasks.
1 code implementation • 6 Mar 2024 • Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine
The key idea is to apply a variant of contrastive learning to time series data.
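A generic InfoNCE objective conveys the flavor of contrastive learning on time series: a state is pulled toward a future state from its own trajectory (the positive) and pushed away from future states drawn from other trajectories (the negatives). This is a textbook sketch under assumed names, not the paper's exact objective or architecture.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=1.0):
    """InfoNCE loss for one (state, future-state) pair.

    The positive is a representation of a future state from the same
    trajectory as the anchor; negatives come from other trajectories.
    Minimizing the loss increases the anchor-positive similarity relative
    to the anchor-negative similarities."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_z)

# Aligned anchor/positive pairs incur lower loss than misaligned ones.
aligned = info_nce((1.0, 0.0), (1.0, 0.0), [(0.0, 1.0)])
misaligned = info_nce((1.0, 0.0), (0.0, 1.0), [(1.0, 0.0)])
```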
no code implementations • 6 Mar 2024 • Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal
Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions.
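One common way to cast value regression as classification is a "two-hot" encoding: the scalar return target is spread over two adjacent bins of a fixed support and the value network is trained with cross-entropy against that distribution. The sketch below is illustrative (function names and bin layout are assumptions, not the paper's exact recipe).

```python
import math

def two_hot(value, bin_centers):
    """Encode a scalar target as a distribution over fixed bins.

    Mass is split between the two neighboring bin centers so that the
    expected bin center equals the original value (after clipping to the
    support range)."""
    probs = [0.0] * len(bin_centers)
    v = min(max(value, bin_centers[0]), bin_centers[-1])
    for i in range(len(bin_centers) - 1):
        lo, hi = bin_centers[i], bin_centers[i + 1]
        if lo <= v <= hi:
            w = (v - lo) / (hi - lo) if hi > lo else 0.0
            probs[i] = 1.0 - w
            probs[i + 1] = w
            break
    return probs

def cross_entropy(target_probs, logits):
    """Cross-entropy between a two-hot target and predicted bin logits."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return -sum(t * math.log(e / z) for t, e in zip(target_probs, exps))

bins = [0.0, 1.0, 2.0, 3.0]
target = two_hot(1.25, bins)
reconstructed = sum(p * c for p, c in zip(target, bins))
```

Training the critic on `cross_entropy(target, logits)` replaces the usual mean-squared-error regression target while preserving the target's expected value.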
no code implementations • 5 Mar 2024 • Fangchen Liu, Kuan Fang, Pieter Abbeel, Sergey Levine
In this paper, we present MOKA (Marking Open-vocabulary Keypoint Affordances), an approach that employs VLMs to solve robotic manipulation tasks specified by free-form language descriptions.
no code implementations • 1 Mar 2024 • Noriaki Hirose, Dhruv Shah, Kyle Stachowicz, Ajay Sridhar, Sergey Levine
Specifically, SELFI stabilizes the online learning process by incorporating the same model-based learning objective from offline pre-training into the Q-values learned with online model-free reinforcement learning.
1 code implementation • 29 Feb 2024 • Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
In this paper, we develop a framework for building multi-turn RL algorithms for fine-tuning LLMs that preserves the flexibility of existing single-turn RL methods for LLMs (e.g., proximal policy optimization) while effectively accommodating multiple turns, long horizons, and delayed rewards.
1 code implementation • 27 Feb 2024 • Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine
Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner?
no code implementations • 26 Feb 2024 • Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Sergey Levine, Tommaso Biancalani
It is natural to frame this as a reinforcement learning (RL) problem, in which the objective is to fine-tune a diffusion model to maximize a reward function that corresponds to some property.
1 code implementation • 23 Feb 2024 • Seohong Park, Tobias Kreiman, Sergey Levine
While a number of methods have been proposed to enable generic self-supervised RL, based on principles such as goal-conditioned RL, behavioral cloning, and unsupervised skill learning, such methods remain limited in terms of either the diversity of the discovered behaviors, the need for high-quality demonstration data, or the lack of a clear adaptation mechanism for downstream tasks.
no code implementations • 23 Feb 2024 • Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M Tseng, Tommaso Biancalani, Sergey Levine
Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins.
no code implementations • 12 Feb 2024 • Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter
In each iteration, the image is annotated with a visual representation of proposals that the VLM can refer to (e.g., candidate robot actions, localizations, or trajectories).
no code implementations • 5 Feb 2024 • William Chen, Oier Mees, Aviral Kumar, Sergey Levine
We find that our policies trained on embeddings from off-the-shelf, general-purpose VLMs outperform equivalent policies trained on generic, non-promptable image embeddings.
no code implementations • 30 Jan 2024 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath
Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.
no code implementations • 29 Jan 2024 • Jianlan Luo, Zheyuan Hu, Charles Xu, You Liang Tan, Jacob Berg, Archit Sharma, Stefan Schaal, Chelsea Finn, Abhishek Gupta, Sergey Levine
We posit that a significant challenge to widespread adoption of robotic RL, as well as further development of robotic RL methods, is the comparative inaccessibility of such methods.
no code implementations • 23 Jan 2024 • Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Edward Lee, Sergey Levine, Yao Lu, Isabel Leal, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu
We experimentally show that such "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs enables instruction-following data collection with robots that can align with human preferences.
no code implementations • 8 Jan 2024 • Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine
This kind of data-driven optimization (DDO) presents a range of challenges beyond those in standard prediction problems, since we need models that successfully predict the performance of new designs that are better than the best designs seen in the training set.
no code implementations • 7 Dec 2023 • Chengshu Li, Jacky Liang, Andy Zeng, Xinyun Chen, Karol Hausman, Dorsa Sadigh, Sergey Levine, Li Fei-Fei, Fei Xia, Brian Ichter
For example, consider prompting an LM to write code that counts the number of times it detects sarcasm in an essay: the LM may struggle to write an implementation for "detect_sarcasm(string)" that can be executed by the interpreter (handling the edge cases would be insurmountable).
1 code implementation • 30 Nov 2023 • Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine
Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.
no code implementations • 21 Nov 2023 • Jianlan Luo, Perry Dong, Yuexiang Zhai, Yi Ma, Sergey Levine
We also provide a unified framework to analyze our RL method and DAgger, presenting an asymptotic analysis of the suboptimality gap for both methods as well as a non-asymptotic sample complexity bound for our method.
no code implementations • 9 Nov 2023 • Joey Hong, Sergey Levine, Anca Dragan
LLMs trained with supervised fine-tuning or "single-step" RL, as in standard RLHF, might struggle with tasks that require such goal-directed behavior, since they are not trained to optimize for overall conversational outcomes after multiple turns of interaction.
1 code implementation • NeurIPS 2023 • Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine
Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.
no code implementations • 2 Nov 2023 • Annie S. Chen, Govind Chada, Laura Smith, Archit Sharma, Zipeng Fu, Sergey Levine, Chelsea Finn
We provide theoretical analysis of our selection mechanism and demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet.
no code implementations • 31 Oct 2023 • Joey Hong, Anca Dragan, Sergey Levine
Theoretically, we show that standard offline RL algorithms conditioned on observation histories suffer from poor sample complexity, in accordance with the above intuition.
no code implementations • 26 Oct 2023 • Laura Smith, YunHao Cao, Sergey Levine
Deep reinforcement learning (RL) can enable robots to autonomously acquire complex behaviors, such as legged locomotion.
no code implementations • 18 Oct 2023 • Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng, Sergey Levine
We use a VQ-VAE to learn state-conditioned action quantization, avoiding the exponential blowup that comes with naïve discretization of the action space.
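The discretization step at the heart of vector quantization is a nearest-neighbor lookup into a learned codebook. The sketch below shows only that lookup with a fixed codebook; in the actual method the codes are learned jointly with an encoder/decoder and conditioned on state, which this toy omits.

```python
def quantize(action, codebook):
    """Map a continuous action vector to its nearest codebook entry (L2).

    Returns the discrete code index and the code vector. With K codes,
    the policy only needs to choose among K options, instead of the
    exponentially many bins of per-dimension discretization."""
    def dist2(a, c):
        return sum((x - y) ** 2 for x, y in zip(a, c))
    idx = min(range(len(codebook)), key=lambda i: dist2(action, codebook[i]))
    return idx, codebook[idx]

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
idx, code = quantize((0.9, 0.1), codebook)
```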
no code implementations • 16 Oct 2023 • Dhruv Shah, Michael Equi, Blazej Osinski, Fei Xia, Brian Ichter, Sergey Levine
Navigation in unfamiliar environments presents a major challenge for robots: while mapping and planning techniques can be used to build up a representation of the world, quickly discovering a path to a desired goal in unfamiliar settings with such methods often requires lengthy mapping and exploration.
no code implementations • 16 Oct 2023 • Han Qi, Xinyang Geng, Stefano Rando, Iku Ohama, Aviral Kumar, Sergey Levine
In computational chemistry, crystal structure prediction (CSP) is an optimization problem that involves discovering the lowest energy stable crystal structure for a given chemical formula.
1 code implementation • 13 Oct 2023 • Seohong Park, Oleh Rybkin, Sergey Levine
Through our experiments in five locomotion and manipulation environments, we demonstrate that METRA can discover a variety of useful behaviors even in complex, pixel-based environments, being the first unsupervised RL method that discovers diverse locomotion behaviors in pixel-based Quadruped and Humanoid.
1 code implementation • 12 Oct 2023 • Max Sobol Mark, Archit Sharma, Fahim Tajwar, Rafael Rafailov, Sergey Levine, Chelsea Finn
Can we leverage offline RL to recover better policies from online interaction?
no code implementations • 11 Oct 2023 • Ajay Sridhar, Dhruv Shah, Catherine Glossop, Sergey Levine
In this paper, we describe how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration, with the latter providing the ability to search novel environments, and the former providing the ability to reach a user-specified goal once it has been located.
1 code implementation • 2 Oct 2023 • Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine
Rather than extrapolating in arbitrary ways, we observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
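The constant that predictions revert toward can be made concrete: for a squared-error objective, the best input-independent prediction is the mean of the training labels. The helper names below are illustrative; the paper characterizes this quantity more generally across loss functions.

```python
def optimal_constant_solution(labels):
    """For squared error, the optimal constant (input-independent)
    prediction is the mean of the training labels."""
    return sum(labels) / len(labels)

def distance_to_ocs(predictions, ocs):
    """Average distance of a batch of predictions from the constant
    solution; under the paper's observation, this tends to shrink as
    inputs move further out of distribution."""
    return sum(abs(p - ocs) for p in predictions) / len(predictions)

ocs = optimal_constant_solution([1.0, 2.0, 3.0])
```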
no code implementations • 22 Sep 2023 • Chethan Bhateja, Derek Guo, Dibya Ghosh, Anikait Singh, Manan Tomar, Quan Vuong, Yevgen Chebotar, Sergey Levine, Aviral Kumar
Our system, called V-PTR, combines the benefits of pre-training on video data with robotic offline RL approaches that train on diverse robot data, resulting in value functions and policies for manipulation tasks that perform better, act robustly, and generalize broadly.
no code implementations • 18 Sep 2023 • Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Sontakke, Grecia Salazar, Huong T Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine
In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data.
no code implementations • 7 Sep 2023 • Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine
We further evaluate on a simulated Sawyer pushing task with eye gaze control, and the Lunar Lander game with simulated user commands, and find that our method improves over baseline interfaces in these domains as well.
no code implementations • 6 Sep 2023 • Zheyuan Hu, Aaron Rovinsky, Jianlan Luo, Vikash Kumar, Abhishek Gupta, Sergey Levine
We demonstrate the benefits of reusing past data as replay buffer initialization for new tasks, for instance, the fast acquisition of intricate manipulation skills in the real world on a four-fingered robotic hand.
1 code implementation • 24 Aug 2023 • Homer Walke, Kevin Black, Abraham Lee, Moo Jin Kim, Max Du, Chongyi Zheng, Tony Zhao, Philippe Hansen-Estruch, Quan Vuong, Andre He, Vivek Myers, Kuan Fang, Chelsea Finn, Sergey Levine
By publicly sharing BridgeData V2 and our pre-trained models, we aim to accelerate research in scalable robot learning methods.
1 code implementation • 28 Jul 2023 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.
1 code implementation • 24 Jul 2023 • Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov
One-step methods perform regularization by doing just a single step of policy improvement, while critic regularization methods do many steps of policy improvement with a regularized objective.
1 code implementation • 24 Jul 2023 • Kyle Hatch, Benjamin Eysenbach, Rafael Rafailov, Tianhe Yu, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
In this paper, we propose a method for offline, example-based control that learns an implicit model of multi-step transitions, rather than a reward function.
1 code implementation • NeurIPS 2023 • Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine
This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.
no code implementations • 18 Jul 2023 • Jianlan Luo, Charles Xu, Xinyang Geng, Gilbert Feng, Kuan Fang, Liam Tan, Stefan Schaal, Sergey Levine
In such settings, learning individual primitives for each stage that succeed with a high enough rate to perform a complete temporally extended task is impractical: if each stage must be completed successfully and has a non-negligible probability of failure, the likelihood of successful completion of the entire task becomes negligible.
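The compounding-failure argument is simple arithmetic: under independent stage outcomes, the overall success rate is the product of the per-stage rates, so even high per-stage reliability decays quickly with task length.

```python
def chain_success_prob(stage_success_rates):
    """Probability that every stage of a multi-stage task succeeds,
    assuming independent stage outcomes."""
    p = 1.0
    for r in stage_success_rates:
        p *= r
    return p

# Five stages at 90% reliability each already drop overall success below 60%.
overall = chain_success_prob([0.9] * 5)
```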
no code implementations • 30 Jun 2023 • Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine
Our method achieves robust performance in the real world by learning an embedding from the labeled data that aligns language not to the goal image, but rather to the desired change between the start and goal images that the instruction corresponds to.
no code implementations • 26 Jun 2023 • Dhruv Shah, Ajay Sridhar, Nitish Dashora, Kyle Stachowicz, Kevin Black, Noriaki Hirose, Sergey Levine
In this paper, we describe the Visual Navigation Transformer (ViNT), a foundation model that aims to bring the success of general-purpose pre-trained models to vision-based robotic navigation.
no code implementations • 19 Jun 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car), and shortcut features (e.g., an object on a road is more likely to be a car).
1 code implementation • 6 Jun 2023 • Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine
Robotic systems that rely primarily on self-supervised learning have the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
no code implementations • 2 Jun 2023 • Noriaki Hirose, Dhruv Shah, Ajay Sridhar, Sergey Levine
By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space.
1 code implementation • 25 May 2023 • Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao liu, Pieter Abbeel, Sergey Levine, Dawn Song
This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.
2 code implementations • 22 May 2023 • Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.
no code implementations • 23 Apr 2023 • Tony Z. Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback.
1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine
In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
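IQL's critic is built on expectile regression: an asymmetric squared loss that, for an expectile parameter above one half, up-weights errors where the target exceeds the prediction, so the fitted value approximates an upper expectile rather than the mean. A minimal scalar version (training loop and function names are illustrative, not the paper's implementation):

```python
def expectile_loss(pred, target, tau=0.7):
    """Asymmetric L2 loss for expectile regression: with tau > 0.5,
    under-predictions (target above pred) are penalized more heavily."""
    u = target - pred
    weight = tau if u > 0 else 1.0 - tau
    return weight * u * u

def fit_expectile(samples, tau=0.7, lr=0.1, steps=2000):
    """Fit a scalar tau-expectile of `samples` by gradient descent on
    the expectile loss. tau = 0.5 recovers the mean."""
    v = 0.0
    for _ in range(steps):
        grad = 0.0
        for x in samples:
            u = x - v
            w = tau if u > 0 else 1.0 - tau
            grad += -2.0 * w * u
        v -= lr * grad / len(samples)
    return v

mean_like = fit_expectile([0.0, 1.0, 2.0, 3.0], tau=0.5)
upper = fit_expectile([0.0, 1.0, 2.0, 3.0], tau=0.9)
```

Pushing tau toward 1 biases the fitted value toward the best outcomes in the data, which is how IQL approximates a maximum over in-distribution actions without ever querying out-of-distribution ones.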
no code implementations • 20 Apr 2023 • Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine
Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from limited amounts of data collected by actively interacting with the environment.
no code implementations • 19 Apr 2023 • Laura Smith, J. Chase Kew, Tianyu Li, Linda Luu, Xue Bin Peng, Sehoon Ha, Jie Tan, Sergey Levine
Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running.
no code implementations • 19 Apr 2023 • Kyle Stachowicz, Dhruv Shah, Arjun Bhorkar, Ilya Kostrikov, Sergey Levine
We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).
1 code implementation • 10 Apr 2023 • Dibya Ghosh, Chethan Bhateja, Sergey Levine
Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.
no code implementations • 20 Mar 2023 • Michael Chang, Alyssa L. Dayan, Franziska Meier, Thomas L. Griffiths, Sergey Levine, Amy Zhang
Object rearrangement is a challenge for embodied agents because solving these tasks requires generalizing across a combinatorially large set of configurations of entities and their locations.
no code implementations • NeurIPS 2023 • Manan Tomar, Riashat Islam, Matthew E. Taylor, Sergey Levine, Philip Bachman
We propose "information gating" as a way to learn parsimonious representations that identify the minimal information required for a task.
3 code implementations • NeurIPS 2023 • Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine
Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.
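The calibration idea can be caricatured in one scalar update: conservatism pushes the learned Q-value down, but the push-down is switched off once the value reaches a reference scale (e.g. the Monte Carlo return of the behavior policy). This toy update is an assumption-laden sketch, not Cal-QL's actual regularizer.

```python
def cal_ql_update(q, reference_return, alpha=1.0, lr=0.1):
    """One illustrative 'calibrated conservatism' step.

    The conservative penalty (scaled by alpha) only applies while the
    current Q-value exceeds the reference return, so repeated updates
    drive Q down to a reasonable scale instead of arbitrarily low."""
    penalty_grad = alpha if q > reference_return else 0.0
    return q - lr * penalty_grad

# An over-estimated value is pushed down to the reference, not below it.
q = 5.0
for _ in range(100):
    q = cal_ql_update(q, reference_return=2.0)
```

Values already at or below the reference are left untouched, which is the "calibrated" part: the conservative value function underestimates the learned policy's value while staying lower-bounded at a sensible scale.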
2 code implementations • 6 Mar 2023 • Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence
Large language models excel at a wide range of complex tasks.
Ranked #2 on Visual Question Answering (VQA) on OK-VQA
no code implementations • NeurIPS 2023 • Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter
Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models.
no code implementations • 19 Feb 2023 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath
This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.
no code implementations • 10 Feb 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts.
2 code implementations • 8 Feb 2023 • Seohong Park, Sergey Levine
A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions.
1 code implementation • 6 Feb 2023 • Philip J. Ball, Laura Smith, Ilya Kostrikov, Sergey Levine
Sample efficiency and exploration remain major challenges in online reinforcement learning (RL).
1 code implementation • 6 Feb 2023 • Amrith Setlur, Don Dennis, Benjamin Eysenbach, aditi raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine
Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points.
no code implementations • 24 Dec 2022 • Qiyang Li, Yuexiang Zhai, Yi Ma, Sergey Levine
Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies.
no code implementations • 21 Dec 2022 • Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine
To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.
no code implementations • 19 Dec 2022 • Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine
In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction.
1 code implementation • 16 Dec 2022 • Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine
Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.
no code implementations • 13 Dec 2022 • Sergey Levine, Dhruv Shah
Navigation is one of the most heavily studied problems in robotics, and is conventionally approached as a geometric mapping and planning problem.
1 code implementation • 13 Dec 2022 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance.
no code implementations • 8 Dec 2022 • Joey Hong, Aviral Kumar, Sergey Levine
This approach can be implemented in practice by conditioning the Q-function from existing conservative algorithms on the confidence. We theoretically show that our learned value functions produce conservative estimates of the true value at any desired confidence.
no code implementations • 1 Dec 2022 • Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni
In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.
no code implementations • 28 Nov 2022 • Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine
The potential of offline reinforcement learning (RL) is that high-capacity models trained on large, heterogeneous datasets can lead to agents that generalize broadly, analogously to similar advances in vision and NLP.
no code implementations • 21 Nov 2022 • Han Qi, Yi Su, Aviral Kumar, Sergey Levine
The goal in offline data-driven decision-making is to synthesize decisions that optimize a black-box utility function, using a previously collected static dataset, with no active interaction.
no code implementations • 21 Nov 2022 • Ted Xiao, Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson
To accomplish this, we introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL): we utilize semi-supervised language labels leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data and then train language-conditioned policies on the augmented datasets.
no code implementations • 2 Nov 2022 • Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine
Both theoretically and empirically, we show that typical offline RL methods, which are based on distribution constraints, fail to learn from data with such non-uniform variability, due to the requirement to stay close to the behavior policy to the same extent across the state space.
no code implementations • 2 Nov 2022 • Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar
In this paper, we show that the issue of conflicting objectives can be resolved by training two generators: one that maximizes return, with the other capturing the "remainder" of the data distribution in the offline dataset, such that the mixture of the two is close to the behavior policy.
2 code implementations • 1 Nov 2022 • Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack.
no code implementations • 27 Oct 2022 • Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine
One of the main observations we make in this work is that, with a suitable representation learning and domain generalization approach, it can be significantly easier for the reward function to generalize to a new but structurally similar task (e.g., inserting a new type of connector) than for the policy.
no code implementations • 24 Oct 2022 • Hao liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel
Large language models (LLM) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.
no code implementations • 18 Oct 2022 • Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham M. Kakade, Sergey Levine
Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice the choice of reward function can be crucial for good results: while in principle the reward only needs to specify what the task is, in reality practitioners often need to design more detailed rewards that provide the agent with some hints about how the task should be completed.
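The gap between a reward that specifies what the task is and one that hints at how to complete it can be made concrete with a toy goal-reaching sketch (entirely illustrative; the function names and threshold are our own, not the paper's):

```python
import numpy as np

def sparse_reward(state, goal, eps=0.05):
    """Task-defining reward: +1 only once the goal is actually reached."""
    return float(np.linalg.norm(state - goal) < eps)

def shaped_reward(state, goal, eps=0.05):
    """Adds a dense negative-distance term that hints at how to reach the goal."""
    return sparse_reward(state, goal, eps) - np.linalg.norm(state - goal)

state, goal = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print(sparse_reward(state, goal))  # 0.0: no learning signal far from the goal
print(shaped_reward(state, goal))  # negative distance still guides the agent
```

The sparse version alone defines the task but gives the agent nothing to climb; the shaped version leaks "how" information, which is exactly the extra design burden the abstract describes.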
no code implementations • 17 Oct 2022 • Annie S. Chen, Archit Sharma, Sergey Levine, Chelsea Finn
We formalize this problem setting, which we call single-life reinforcement learning (SLRL), where an agent must complete a task within a single episode without interventions, utilizing its prior experience while contending with some form of novelty.
no code implementations • 14 Oct 2022 • Noriaki Hirose, Dhruv Shah, Ajay Sridhar, Sergey Levine
Machine learning techniques rely on large and diverse datasets for generalization.
no code implementations • 12 Oct 2022 • Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine
The utilization of broad datasets has proven to be crucial for generalization across a wide range of fields.
1 code implementation • 11 Oct 2022 • Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine
To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.
1 code implementation • 7 Oct 2022 • Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine
Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data.
no code implementations • 6 Oct 2022 • Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal
In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.
no code implementations • 18 Sep 2022 • Raj Ghugare, Homanga Bharadhwaj, Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov
In this work, we propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
1 code implementation • 12 Sep 2022 • Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine
In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots.
1 code implementation • 16 Aug 2022 • Laura Smith, Ilya Kostrikov, Sergey Levine
Deep reinforcement learning is a promising approach for learning policies in uncontrolled environments without requiring domain knowledge.
1 code implementation • 9 Aug 2022 • Marwa Abdulhai, Natasha Jaques, Sergey Levine
IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them.
no code implementations • 1 Aug 2022 • Yandong Ji, Zhongyu Li, Yinan Sun, Xue Bin Peng, Sergey Levine, Glen Berseth, Koushil Sreenath
Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task.
no code implementations • 12 Jul 2022 • Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter
We investigate a variety of sources of feedback, such as success detection, scene description, and human interaction.
no code implementations • 11 Jul 2022 • Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
1 code implementation • 10 Jul 2022 • Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings.
no code implementations • 5 Jul 2022 • Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine
Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown.
1 code implementation • 2 Jul 2022 • Michael Chang, Thomas L. Griffiths, Sergey Levine
Iterative refinement -- start with a random guess, then iteratively improve the guess -- is a useful paradigm for representation learning because it offers a way to break symmetries among equally plausible explanations for the data.
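The start-with-a-guess-then-refine template generalizes well beyond representation learning; a minimal self-contained illustration is Newton's iteration for square roots (our own example, unrelated to the paper's models):

```python
def refine(guess, target):
    # One refinement step: Newton's update for f(x) = x**2 - target.
    return 0.5 * (guess + target / guess)

def iterative_sqrt(target, guess=1.0, steps=20):
    # Start from a rough guess, then repeatedly improve it in place.
    for _ in range(steps):
        guess = refine(guess, target)
    return guess

print(iterative_sqrt(2.0))  # converges to sqrt(2) ≈ 1.41421356
```

The same loop structure, with a learned refinement step in place of Newton's update, is what iterative-refinement representation learners repeat over a randomly initialized latent guess.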
no code implementations • 21 Jun 2022 • Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine
Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs.
no code implementations • 15 Jun 2022 • Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable, and instead equips RL algorithms with additional representation learning components (e.g., auxiliary losses, data augmentation).
no code implementations • 7 Jun 2022 • Benjamin Eysenbach, Soumith Udatha, Sergey Levine, Ruslan Salakhutdinov
Prior work has proposed a simple strategy for reinforcement learning (RL): label experience with the outcomes achieved in that experience, and then imitate the relabeled experience.
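The relabel-then-imitate recipe described above can be sketched as hindsight relabeling of logged trajectories (a minimal sketch with made-up data structures, not the paper's code):

```python
def relabel_trajectory(trajectory):
    """Relabel each (state, action) pair in a trajectory with the outcome the
    trajectory actually achieved, turning arbitrary experience into
    'expert' data for reaching that outcome."""
    achieved_outcome = trajectory[-1][0]  # the final state that was reached
    return [(state, achieved_outcome, action) for state, action in trajectory]

# A toy trajectory of (state, action) pairs on a 1-D line, ending in state 3.
traj = [(0, +1), (1, +1), (2, +1), (3, 0)]
print(relabel_trajectory(traj)[0])  # (0, 3, 1): from state 0, to reach 3, take +1
```

An outcome-conditioned policy trained by imitation on such relabeled tuples learns to reach whatever outcome it is conditioned on, which is the strategy whose limits the paper then analyzes.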
1 code implementation • 5 Jun 2022 • Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine
Large language models distill broad knowledge from text corpora.
no code implementations • 3 Jun 2022 • Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine
Supervised learning methods trained with maximum likelihood objectives often overfit on training data.
1 code implementation • 27 May 2022 • Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel
We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.
1 code implementation • 24 May 2022 • Siddharth Reddy, Sergey Levine, Anca D. Dragan
How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?
2 code implementations • 20 May 2022 • Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine
Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.
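The conventional pipeline this abstract contrasts against (fit a dynamics model, then hand the rest of the decision-making to a classical trajectory optimizer) can be sketched with random-shooting planning; here a known toy dynamics function stands in for a learned model:

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamics(state, action):
    # Stand-in for a learned model: a simple 1-D integrator.
    return state + action

def reward(state):
    return -abs(state - 5.0)  # goal: drive the state toward 5

def shoot(state, horizon=5, n_candidates=256):
    """Classical random-shooting optimizer on top of the (learned) model:
    sample action sequences, roll each out, keep the best first action."""
    best_ret, best_action = -np.inf, 0.0
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=horizon)
        s, ret = state, 0.0
        for a in actions:
            s = dynamics(s, a)
            ret += reward(s)
        if ret > best_ret:
            best_ret, best_action = ret, actions[0]
    return best_action

print(shoot(0.0))  # first action of the best sampled plan
```

In this split, all learning effort goes into `dynamics`, while the planner is hand-designed; the paper's point is that this offloading can instead be folded into a single generative model.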
no code implementations • 17 May 2022 • Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine
Our experimental results show that PTP can generate feasible sequences of subgoals that enable the policy to efficiently solve the target tasks.
no code implementations • 4 May 2022 • Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, Sanja Fidler
By leveraging a massively parallel GPU-based simulator, we are able to train skill embeddings using over a decade of simulated experiences, enabling our model to learn a rich and versatile repertoire of skills.
no code implementations • 28 Apr 2022 • Rowan Mcallister, Blake Wulfe, Jean Mercat, Logan Ellis, Sergey Levine, Adrien Gaidon
Autonomous vehicle software is typically structured as a modular pipeline of individual components (e.g., perception, prediction, and planning) to help separate concerns into interpretable sub-tasks.
no code implementations • 27 Apr 2022 • Philippe Hansen-Estruch, Amy Zhang, Ashvin Nair, Patrick Yin, Sergey Levine
We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulation manipulation tasks.
2 code implementations • NAACL 2022 • Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine
Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.
no code implementations • Findings (NAACL) 2022 • Charlie Snell, Mengjiao Yang, Justin Fu, Yi Su, Sergey Levine
Goal-oriented dialogue systems face a trade-off between fluent language generation and task-specific control.
no code implementations • ICLR 2022 • Homanga Bharadhwaj, Mohammad Babaeizadeh, Dumitru Erhan, Sergey Levine
We propose a modified objective for model-based RL that, in combination with mutual information maximization, allows us to learn representations and dynamics for visual model-based RL without reconstruction in a way that explicitly prioritizes functionally relevant factors.
no code implementations • 12 Apr 2022 • Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine
To answer this question, we characterize the properties of environments that allow offline RL methods to perform better than BC methods, even when only provided with expert data.
1 code implementation • 5 Apr 2022 • Ikechukwu Uchendu, Ted Xiao, Yao Lu, Banghua Zhu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Cong Ma, Jiantao Jiao, Sergey Levine, Karol Hausman
In addition, we provide an upper bound on the sample complexity of JSRL and show that with the help of a guide-policy, one can improve the sample complexity for non-optimism exploration methods from exponential in horizon to polynomial.
3 code implementations • 4 Apr 2022 • Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment.
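The combination rule described above (high-level usefulness from the language model, gated by the feasibility signal from the skill's value function) can be sketched in a few lines; all scores below are invented for illustration and this is not the released implementation:

```python
def select_skill(skills, lm_score, value):
    """Rank candidate skills by (language-model usefulness) x (value-function
    feasibility) and return the best one. Scores here are illustrative."""
    return max(skills, key=lambda s: lm_score[s] * value[s])

skills = ["pick up sponge", "go to table", "open drawer"]
lm_score = {"pick up sponge": 0.7, "go to table": 0.2, "open drawer": 0.1}  # relevance to the instruction
value = {"pick up sponge": 0.1, "go to table": 0.9, "open drawer": 0.5}     # feasibility in the current state
print(select_skill(skills, lm_score, value))  # "go to table": relevant enough and currently feasible
```

The product captures the grounding idea: a skill the language model loves is still rejected if its value function says it cannot succeed from the robot's current state.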
no code implementations • 4 Mar 2022 • Jensen Gao, Siddharth Reddy, Glen Berseth, Nicholas Hardy, Nikhilesh Natraj, Karunesh Ganguly, Anca D. Dragan, Sergey Levine
In the typing domain, we leverage backspaces as feedback that the interface did not perform the desired action.
no code implementations • 23 Feb 2022 • Dhruv Shah, Sergey Levine
In this work, we propose an approach that integrates learning and planning, and can utilize side information such as schematic roadmaps, satellite maps and GPS coordinates as a planning heuristic, without relying on them being accurate.
3 code implementations • 17 Feb 2022 • Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine
To address this, we present Design-Bench, a benchmark for offline MBO with a unified evaluation protocol and reference implementations of recent methods.
no code implementations • 5 Feb 2022 • Sean Chen, Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine
Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface.
no code implementations • 4 Feb 2022 • Eric Jang, Alex Irpan, Mohi Khansari, Daniel Kappler, Frederik Ebert, Corey Lynch, Sergey Levine, Chelsea Finn
In this paper, we study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning.
no code implementations • 3 Feb 2022 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Chelsea Finn, Sergey Levine
One natural solution is to learn a reward function from the labeled data and use it to label the unlabeled data.
no code implementations • 1 Feb 2022 • Jathushan Rajasegaran, Chelsea Finn, Sergey Levine
In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future.
1 code implementation • 20 Dec 2021 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
2 code implementations • ICLR 2022 • Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.
no code implementations • ICLR 2022 • Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine
In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.
1 code implementation • ICLR 2022 • Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang
Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.
no code implementations • ICLR 2022 • Glen Berseth, Zhiwei Zhang, Grace Zhang, Chelsea Finn, Sergey Levine
Beyond simply transferring past experience to new tasks, our goal is to devise continual reinforcement learning algorithms that learn to learn, using their experience on previous tasks to learn new tasks more quickly.
no code implementations • NeurIPS 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
no code implementations • NeurIPS 2021 • Aurick Zhou, Sergey Levine
When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates. While improving the robustness of neural networks is one promising approach to mitigate this issue, an appealing alternative to robustifying networks against all possible test-time shifts is to instead directly adapt them to unlabeled inputs from the particular distribution shift we encounter at test time. However, this poses a challenging question: in the standard Bayesian model for supervised learning, unlabeled inputs are conditionally independent of model parameters when the labels are unobserved, so what can unlabeled data tell us about the model parameters at test-time?
no code implementations • ICLR 2022 • Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander Toshev, Sergey Levine, Brian Ichter
Hierarchical reinforcement learning aims to enable this by providing a bank of low-level skills as action abstractions.
1 code implementation • ICLR 2022 • Mengjiao Yang, Sergey Levine, Ofir Nachum
In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.
1 code implementation • 24 Oct 2021 • Sergey Levine
The recent history of machine learning research has taught us that machine learning methods can be most effective when they are provided with very large, high-capacity models, and trained on very large and diverse datasets.
no code implementations • ICLR 2022 • Tianjun Zhang, Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine, Joseph E. Gonzalez
Goal-conditioned reinforcement learning (RL) can solve tasks in a wide range of domains, including navigation and manipulation, but learning to reach distant goals remains a central challenge to the field.
1 code implementation • ICLR 2022 • Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine
An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any further simulation.
2 code implementations • 18 Oct 2021 • Marvin Zhang, Sergey Levine, Chelsea Finn
We study the problem of test time robustification, i. e., using the test input to improve model robustness.
15 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state.
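The upper-expectile idea can be illustrated with a standalone expectile regression sketch (our own illustration; the value of tau and the data are arbitrary):

```python
import numpy as np

def expectile_loss(diff, tau=0.9):
    """Asymmetric squared loss: for tau > 0.5, positive errors (cases where the
    estimate undershoots the sample) are penalized more heavily, so the
    minimizer is an upper expectile of the samples rather than their mean."""
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return weight * diff ** 2

# Fit a scalar v to samples q by grid search; the minimizer lands near the
# upper end of the sample distribution instead of at its mean.
q = np.array([0.0, 1.0, 2.0, 10.0])
candidates = np.linspace(0.0, 10.0, 1001)
losses = [expectile_loss(q - v, tau=0.9).mean() for v in candidates]
v_star = candidates[int(np.argmin(losses))]
print(v_star)  # ≈ 7.75, well above the mean of q (3.25)
```

Regressing a value function with this loss over actions in the dataset approximates the value of the best in-distribution actions in each state, without ever querying values of unseen actions.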
1 code implementation • 6 Oct 2021 • Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov
Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning.
1 code implementation • ICLR 2022 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function.
no code implementations • 29 Sep 2021 • Marvin Mengxin Zhang, Sergey Levine, Chelsea Finn
We study the problem of test time robustification, i. e., using the test input to improve model robustness.
no code implementations • ICLR 2022 • Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine
In this paper, our goal is to characterize environments and dataset compositions where offline RL leads to better performance than BC.
1 code implementation • ICLR 2022 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state.
no code implementations • ICLR 2022 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine
These methods, which we collectively refer to as reinforcement learning via supervised learning (RvS), involve a number of design decisions, such as policy architectures and how the conditioning variable is constructed.
no code implementations • 29 Sep 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan
Furthermore, such an agent can internally represent the complex dynamics of the real-world and therefore can acquire a representation useful for a variety of visual perception tasks.
no code implementations • 29 Sep 2021 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Chelsea Finn, Sergey Levine, Karol Hausman
However, these benefits come at a cost -- for data to be shared between tasks, each transition must be annotated with reward labels corresponding to other tasks.
2 code implementations • 27 Sep 2021 • Frederik Ebert, Yanlai Yang, Karl Schmeckpeper, Bernadette Bucher, Georgios Georgakis, Kostas Daniilidis, Chelsea Finn, Sergey Levine
Robot learning holds the promise of learning policies that generalize broadly.
no code implementations • 27 Sep 2021 • Aurick Zhou, Sergey Levine
When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
1 code implementation • 22 Sep 2021 • Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.
no code implementations • NeurIPS 2021 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn
We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks, and utilize all of this data to learn behaviors for all the tasks more effectively rather than training each one in isolation.
1 code implementation • NeurIPS 2021 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression.
no code implementations • 28 Jul 2021 • Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine
Our aim is to devise a robotic reinforcement learning system for learning navigation and manipulation together, in an autonomous way without human intervention, enabling continual learning under realistic assumptions.
no code implementations • NeurIPS 2021 • Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents.
no code implementations • 15 Jul 2021 • Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we show that an uncertainty aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes.
2 code implementations • 14 Jul 2021 • Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
no code implementations • NeurIPS 2021 • Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, Sergey Levine
Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world.
1 code implementation • ICML Workshop URL 2021 • Arnaud Fickinger, Natasha Jaques, Samyak Parajuli, Michael Chang, Nicholas Rhinehart, Glen Berseth, Stuart Russell, Sergey Levine
Unsupervised reinforcement learning (RL) studies how to leverage environment statistics to learn useful behaviors without the cost of reward engineering.
1 code implementation • 8 Jul 2021 • Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine
If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.
1 code implementation • NeurIPS 2021 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.
no code implementations • ICLR Workshop Learning_to_Learn 2021 • Michael Chang, Sidhant Kaushik, Sergey Levine, Thomas L. Griffiths
Empirical evidence suggests that such action-value methods are more sample efficient than policy-gradient methods on transfer problems that require only sparse changes to a sequence of previously optimal decisions.
1 code implementation • 24 Jun 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan
There is a growing body of evidence that underfitting on the training data is one of the primary causes of low-quality predictions.
no code implementations • 24 Jun 2021 • Katie Kang, Gregory Kahn, Sergey Levine
In this work, we propose a deep reinforcement learning algorithm with hierarchically integrated models (HInt).
1 code implementation • 24 Jun 2021 • Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine
The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.
no code implementations • NeurIPS 2021 • Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine
Mutual information maximization provides an appealing formalism for learning representations of data.
1 code implementation • ICML Workshop URL 2021 • Michael Janner, Qiyang Li, Sergey Levine
However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
no code implementations • ICML Workshop URL 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
2 code implementations • NeurIPS 2021 • Michael Janner, Qiyang Li, Sergey Levine
Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.
no code implementations • 2 Jun 2021 • Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Shane Gu
Learning to reach goal states and learning diverse skills through mutual information (MI) maximization have been proposed as principled frameworks for self-supervised reinforcement learning, allowing agents to acquire broadly applicable multitask policies with minimal reward engineering.
2 code implementations • 1 Jun 2021 • Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine
In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.
no code implementations • 23 Apr 2021 • Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine
Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity.
no code implementations • 22 Apr 2021 • Abhishek Gupta, Justin Yu, Tony Z. Zhao, Vikash Kumar, Aaron Rovinsky, Kelvin Xu, Thomas Devlin, Sergey Levine
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
1 code implementation • 21 Apr 2021 • Nicholas Rhinehart, Jeff He, Charles Packer, Matthew A. Wright, Rowan Mcallister, Joseph E. Gonzalez, Sergey Levine
Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents.
no code implementations • NeurIPS 2021 • Tim G. J. Rudner, Vitchyr H. Pong, Rowan Mcallister, Yarin Gal, Sergey Levine
While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it.
no code implementations • 16 Apr 2021 • Dmitry Kalashnikov, Jacob Varley, Yevgen Chebotar, Benjamin Swanson, Rico Jonschkowski, Chelsea Finn, Sergey Levine, Karol Hausman
In this paper, we study how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously, sharing exploration, experience, and representations across tasks.
no code implementations • 15 Apr 2021 • Yevgen Chebotar, Karol Hausman, Yao Lu, Ted Xiao, Dmitry Kalashnikov, Jake Varley, Alex Irpan, Benjamin Eysenbach, Ryan Julian, Chelsea Finn, Sergey Levine
We consider the problem of learning useful robotic skills from previously collected offline data without access to manually specified rewards or additional online exploration, a setting that is becoming increasingly important for scaling robot learning by reusing past robotic data.
no code implementations • 12 Apr 2021 • Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
3 code implementations • 5 Apr 2021 • Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa
Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.
3 code implementations • ICLR 2021 • Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.
no code implementations • 26 Mar 2021 • Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath
Developing robust walking controllers for bipedal robots is a challenging endeavor.
1 code implementation • NeurIPS 2021 • Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov
Can we devise RL algorithms that instead enable users to specify tasks simply by providing examples of successful outcomes?
1 code implementation • 23 Mar 2021 • Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Shane Gu
Progress in deep reinforcement learning (RL) research is largely enabled by benchmark task environments.
no code implementations • ICLR Workshop Learning_to_Learn 2021 • John D Co-Reyes, Sarah Feng, Glen Berseth, Jie Qui, Sergey Levine
Current reinforcement learning algorithms struggle to adapt quickly to new situations, typically requiring large amounts of experience and extensive optimization over that experience.
no code implementations • ICLR 2022 • Benjamin Eysenbach, Sergey Levine
Many potential applications of reinforcement learning (RL) require guarantees that the agent will perform well in the face of disturbances to the dynamics or reward function.
1 code implementation • 24 Feb 2021 • Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar
This allows us to disentangle shared features and dynamics of the environment from agent-specific rewards and policies.
4 code implementations • NeurIPS 2021 • Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.
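The regularization idea described above can be sketched schematically: push Q-values down on state-action tuples generated by model rollouts and up on tuples from the real offline dataset. This is a minimal illustrative penalty term, not the paper's exact objective; the coefficient `beta` and the input arrays are assumptions for illustration.

```python
import numpy as np

def conservative_penalty(q_model_rollout, q_dataset, beta=1.0):
    """Schematic conservative regularizer in the spirit of COMBO.

    Minimizing beta * (E[Q on model rollouts] - E[Q on dataset]) alongside a
    standard Bellman error term lowers value estimates on out-of-support
    tuples produced by the learned model, while keeping dataset values up.
    """
    return beta * (np.mean(q_model_rollout) - np.mean(q_dataset))
```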
no code implementations • ICLR 2021 • Justin Fu, Sergey Levine
We propose to tackle this problem by leveraging the normalized maximum-likelihood (NML) estimator, which provides a principled approach to handling uncertainty and out-of-distribution inputs.
no code implementations • 4 Feb 2021 • Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pastor, Sergey Levine
Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains.
5 code implementations • ICLR 2021 • John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust
Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.
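For context, the temporal-difference algorithm the method rediscovers is the classic tabular TD(0) update; the sketch below uses illustrative states and hyperparameters, not details from the paper.

```python
import numpy as np

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] = V[s] + alpha * td_error
    return V, td_error

# Toy example: a single transition in a 3-state value table.
V = np.zeros(3)
V, err = td0_update(V, s=0, r=1.0, s_next=1)
```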
no code implementations • 1 Jan 2021 • Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.
no code implementations • ICLR 2021 • Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer
To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).
no code implementations • 1 Jan 2021 • Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine
Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
no code implementations • ICLR 2021 • Amy Zhang, Rowan Thomas McAllister, Roberto Calandra, Yarin Gal, Sergey Levine
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.