上QQ阅读APP看书,第一时间看更新
Policy
The policy denoted by prescribes what action is to be taken given the state. It can be seen as a function that maps states to actions. There are two major types of policy: deterministic policies and stochastic policies.
A deterministic policy prescribes one action for a given state, that is, there is only one action, , given s. Mathematically, it means .
A stochastic policy prescribes an action distribution given a state at time step , that is, there are multiple actions with a probability value for each action. Mathematically, it means .
Agents following different policies may exhibit different behaviors in the same environment.