Reinforcement Learning with TensorFlow

Understanding an OpenAI Gym environment

To understand the basics of importing the Gym package, loading an environment, and other important functions associated with OpenAI Gym, let's look at the Frozen Lake environment as an example.

Load the Frozen Lake environment in the following way:

import gym 
env = gym.make('FrozenLake-v0') # the make function of gym loads the specified environment

Next, we come to resetting the environment. While performing a reinforcement learning task, an agent learns over multiple episodes. At the start of each episode, the environment must therefore be reset so that it returns to its initial configuration and the agent begins from the start state. The following code shows the process for resetting an environment:

import gym 
env = gym.make('FrozenLake-v0')
s = env.reset() # resets the environment and returns the start state as a value
print(s)

-----------
0 #initial state is 0

After each action, you may want to display the agent's current position in the environment. Visualizing that status is done by calling:

env.render()

------------
SFFF
FHFH
FFFH
HFFG

The preceding output shows that this is an environment with a 4 x 4 grid, that is, 16 states arranged in the preceding manner, where S, F, H, and G represent the different types of state:

  • S: Start block
  • F: Frozen block
  • H: Block with a hole
  • G: Goal block

In newer versions of Gym, the environment's internal features can't be accessed or modified directly; you first have to unwrap the environment:

env = env.unwrapped
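
For example, the unwrapped Frozen Lake environment exposes its transition model through the P attribute of Gym's discrete toy-text environments, where P[state][action] lists the possible outcomes of taking an action in a state. Here is a minimal sketch (the exact attribute layout may vary across Gym versions):

# each entry of env.P[state][action] is a (probability, next_state, reward, done) tuple
for prob, next_state, reward, done in env.P[0][1]:
    print(prob, next_state, reward, done)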

Each environment is defined by a state space and an action space over which the agent operates. The type (discrete or continuous) and size of these spaces are very important to know in order to build a reinforcement learning agent:

print(env.action_space)
print(env.action_space.n)

----------------
Discrete(4)
4

The Discrete(4) output means that the action space of the Frozen Lake environment is a discrete set of values, with four distinct actions that the agent can perform.
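
In Frozen Lake, these four actions correspond to moving left, down, right, and up. To draw a random valid action, for instance while exploring, the action space provides a sample method. A minimal sketch:

a = env.action_space.sample() # returns a random action, here an integer in the range [0, 3]
print(a)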

print(env.observation_space)
print(env.observation_space.n)

----------------
Discrete(16)
16

The Discrete(16) output means that the observation (state) space of the Frozen Lake environment is a discrete set of values, with 16 distinct states for the agent to explore.
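
Putting these functions together, here is a minimal sketch of a random agent playing one episode. In Gym's classic API, the step function applies an action and returns the next state, the reward, a done flag marking the end of the episode, and a diagnostic info dictionary:

import gym

env = gym.make('FrozenLake-v0')
s = env.reset()
done = False

while not done:
    a = env.action_space.sample()        # choose a random action
    s, reward, done, info = env.step(a)  # apply it and observe the outcome
    env.render()

print("Episode finished with reward:", reward)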