Hands-On Intelligent Agents with OpenAI Gym
上QQ阅读APP看书,第一时间看更新

Policy

The policy denoted by  prescribes what action is to be taken given the state. It can be seen as a function that maps states to actions. There are two major types of policy: deterministic policies and stochastic policies.

A deterministic policy prescribes one action for a given state, that is, there is only one action, , given s. Mathematically, it means .

A stochastic policy prescribes an action distribution given a state  at time step , that is, there are multiple actions with a probability value for each action. Mathematically, it means .

Agents following different policies may exhibit different behaviors in the same environment.