／var／log marcus chiu

❯

❯

❯

❯

❯

❯

Policy Gradient Methods

Created on Jul 16, 2023 · Last Modified on Aug 24, 2024

are a type of reinforcement learning technique that relies upon optimizing parametrized policies with respect to the expected return (long-term cumulative reward) by gradient descent