SAC

Posted on 2021-07-22, edited on 2021-09-16


Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Maximum entropy reinforcement learning optimizes policies to maximize both the expected return and the expected entropy of the policy.
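To make the combined objective concrete, here is a minimal sketch that scores a single trajectory by summing, at each step, the reward plus a weighted entropy bonus, r_t + alpha * H(pi(.|s_t)). It assumes a discrete action space. The function names, the temperature `alpha=0.2`, the discount `gamma=0.99`, and the toy rewards are illustrative assumptions, not values taken from this post.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_entropy_return(rewards, action_probs, alpha=0.2, gamma=0.99):
    """Discounted sum of r_t + alpha * H(pi(.|s_t)) along one trajectory.

    rewards[t] is the reward at step t; action_probs[t] is the policy's
    action distribution in the state visited at step t.
    alpha and gamma are assumed hyperparameters for this illustration.
    """
    total = 0.0
    for t, (r, probs) in enumerate(zip(rewards, action_probs)):
        total += (gamma ** t) * (r + alpha * entropy(probs))
    return total


# Same (hypothetical) rewards, two different policies.
rewards = [1.0, 0.0, 2.0]
uniform = [[0.5, 0.5]] * 3   # maximally random over two actions
greedy = [[1.0, 0.0]] * 3    # deterministic, zero entropy

soft_uniform = max_entropy_return(rewards, uniform)
soft_greedy = max_entropy_return(rewards, greedy)
print(soft_uniform, soft_greedy)
```

With identical rewards, the uniform policy scores higher than the deterministic one because of the entropy bonus. In practice the two terms trade off: a more random policy usually collects less reward, and `alpha` controls how much randomness is worth.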