Deep Deterministic Policy Gradients
Deep Deterministic Policy Gradient (DDPG) is an off-policy, model-free, actor-critic algorithm based on the Deterministic Policy Gradient (DPG) theorem (proceedings.mlr.press/v32/silver14.pdf). Unlike deep Q-learning-based methods, which select an action by comparing the values of a finite set of actions, actor-critic policy gradient methods apply readily to continuous action spaces as well as to tasks with discrete action spaces.
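To make the contrast concrete, the following minimal sketch compares the two styles of action selection. It is an illustration only, not the implementation used in this book: the function names, the linear actor, the tanh squashing, and the weights are all hypothetical choices made for the example.

```python
import math


def greedy_discrete_action(q_values):
    """Q-learning-style selection: pick the index of the largest action value.

    This only works when the actions can be enumerated in a finite list.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def deterministic_actor(state, weights, bias, action_low, action_high):
    """Hypothetical linear actor: maps a state directly to one continuous action."""
    pre_activation = sum(w * s for w, s in zip(weights, state)) + bias
    squashed = math.tanh(pre_activation)  # value in (-1, 1)
    # Rescale from (-1, 1) to the (action_low, action_high) range.
    return action_low + 0.5 * (squashed + 1.0) * (action_high - action_low)


# Discrete case: three enumerable actions with made-up value estimates.
discrete_choice = greedy_discrete_action([0.1, 0.7, 0.3])

# Continuous case: a made-up 2-D state mapped to a steering value in (-1, 1).
steering = deterministic_actor(
    [0.5, -0.2], weights=[0.4, 0.1], bias=0.0, action_low=-1.0, action_high=1.0
)
```

The discrete selector has to enumerate every action, which is not possible when the action is a real number such as a steering angle. The actor sidesteps this by outputting the action itself, and it returns the same action every time it is given the same state and parameters.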
Core concepts
In Chapter 8, Implementing an Intelligent Autonomous Car Driving Agent Using the Deep Actor-Critic algorithm, we walked you through the derivation of the policy gradient theorem, which we reproduce here for context:
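$$\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\left[\nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi_\theta}(s, a)\right]$$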
You may recall that the policy we considered was a stochastic function that assigned a probability to each action given the state ($s$) and the parameters ($\theta$). In deterministic policy gradients, the stochastic policy is replaced by a deterministic policy that prescribes a fixed action for a given state and set of parameters $\theta$. In...