Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

A platform for Applied Reinforcement Learning (Applied RL)

ICML 2018

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Omniverse Isaac Gym | AllegroHand | SAC | Average Return | 296.49 | #2 |
| Omniverse Isaac Gym | Ant | SAC | Average Return | 7717.93 | #2 |
| OpenAI Gym | Ant-v4 | SAC | Average Return | 5208.09 | #3 |
| Omniverse Isaac Gym | Anymal | SAC | Average Return | 11.87 | #2 |
| Omniverse Isaac Gym | FrankaCabinet | SAC | Average Return | 1721.98 | #2 |
| OpenAI Gym | HalfCheetah-v4 | SAC | Average Return | 15836.04 | #1 |
| OpenAI Gym | Hopper-v4 | SAC | Average Return | 2882.56 | #3 |
| Omniverse Isaac Gym | Humanoid | SAC | Average Return | 4028.31 | #2 |
| OpenAI Gym | Humanoid-v4 | SAC | Average Return | 6211.50 | #2 |
| Omniverse Isaac Gym | Ingenuity | SAC | Average Return | 5301.99 | #1 |
| Continuous Control | Lunar Lander (OpenAI Gym) | SAC | Score | 284.59±0.97 | #1 |
| OpenAI Gym | Walker2d-v4 | SAC | Average Return | 5745.27 | #1 |

Methods