smartstart.environments package

Submodules

smartstart.environments.environment module

Environment module

Contains base classes for constructing environments and visualizers

class Environment(name=None)

Bases: object

Base class for constructing environments

reset(start_state=None)

Reset the agent to the start state

Parameters:start_state (np.ndarray, optional) – Start state; if not supplied, the standard start state is used
step(action)

Take a step in the environment

Parameters:action (int, float or np.ndarray) – action
Raises:NotImplementedError – use a subclass of Environment like GridWorld.
visualizer

Visualizer – visualizer of the agent, value function, density and/or console.

Sets the name of the visualizer to be equal to the environment's name

class Visualizer

Bases: object

CONSOLE = 0
DENSITY = 3
LIVE_AGENT = 1
VALUE_FUNCTION = 2
add_visualizer(*args)

Adds a window to the visualizer

Available windows can be found in the implemented class attributes

Parameters:args (list of int) – visualizers to add. Possible visualizers are defined in the class attributes
Raises:NotImplementedError
render(value_map=None, density_map=None, message=None, close=False)

Render the current state of the training algorithm

The render method should include all functionality to close the visualizer. When the close parameter is True, the visualizer should be closed and the return value should tell the learning algorithm to stop rendering.

Parameters:
  • value_map (np.ndarray, optional) – value function map
  • density_map (np.ndarray, optional) – density map
  • message (str, optional) – message to print to the console window
  • close (bool) – True if the visualizer has to be closed
Returns:

True if the agent has to keep rendering

Return type:

bool

Raises:

NotImplementedError
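
As an illustration of the interface described above, the following is a minimal sketch of a console-only visualizer. It is a hypothetical stand-in that does not import smartstart; a real implementation would subclass Visualizer. The class name ConsoleVisualizer, the active_visualizers attribute and the ValueError are assumptions made for this sketch, and the return value follows the base-class description (True while rendering should continue).

```python
class ConsoleVisualizer:
    """Hypothetical stand-in for a Visualizer subclass; prints to stdout only."""

    # Same constants as listed for the Visualizer base class
    CONSOLE = 0
    LIVE_AGENT = 1
    VALUE_FUNCTION = 2
    DENSITY = 3

    def __init__(self, name=None):
        self.name = name
        self.active_visualizers = set()  # assumed bookkeeping attribute

    def add_visualizer(self, *args):
        for visualizer in args:
            if visualizer != self.CONSOLE:
                # Assumption: this sketch supports the console window only
                raise ValueError("this sketch only supports CONSOLE")
            self.active_visualizers.add(visualizer)

    def render(self, value_map=None, density_map=None, message=None, close=False):
        if close:
            self.active_visualizers.clear()
            return False  # signal the caller to stop rendering
        if self.CONSOLE in self.active_visualizers and message is not None:
            print(message)
        return True  # keep rendering


vis = ConsoleVisualizer(name="demo")
vis.add_visualizer(ConsoleVisualizer.CONSOLE)
keep_going = vis.render(message="episode 1 finished")
stopped = vis.render(close=True)
```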

smartstart.environments.generate_gridworld module

Random GridWorld Generator module

Defines methods necessary for generating a random gridworld. The generated gridworld is a numpy array with dtype int where each entry is a state.

Attributes in the gridworld are defined with the following types:
  0. State
  1. Wall
  2. Start state
  3. Goal state

States will have a grid size of 3x3. States are separated by a 1x3 or 3x1 wall that can be turned into a state.

generate_gridworld(size=None)

Generate a random GridWorld

Generates a random GridWorld using a depth-first search (see Wikipedia).

Parameters:size (tuple) – size of the GridWorld (default = (100, 100))
Returns:
  • str – name (= Random)
  • GridWorld – new GridWorld object
  • int – scale (= 1)
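
To make the depth-first search idea concrete, here is a minimal sketch of a randomized DFS maze carver. It is not the module's implementation: it uses 1x1 cells separated by 1x1 walls instead of the 3x3 states described above, the function name dfs_maze is hypothetical, and the type codes 0 to 3 (state, wall, start, goal) are an assumption.

```python
import random

import numpy as np

STATE, WALL, START, GOAL = 0, 1, 2, 3  # assumed type codes


def dfs_maze(rows, cols, seed=None):
    """Carve a maze with randomized depth-first search.

    Cells sit at odd indices of a (2*rows+1, 2*cols+1) grid; the entries
    between them are walls that are opened when the search passes through.
    """
    rng = random.Random(seed)
    grid = np.full((2 * rows + 1, 2 * cols + 1), WALL, dtype=int)
    stack = [(0, 0)]
    visited = {(0, 0)}
    grid[1, 1] = STATE
    while stack:
        r, c = stack[-1]
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1))
                      if 0 <= r + dr < rows and 0 <= c + dc < cols
                      and (r + dr, c + dc) not in visited]
        if not neighbours:
            stack.pop()  # dead end: backtrack
            continue
        nr, nc = rng.choice(neighbours)
        grid[r + nr + 1, c + nc + 1] = STATE  # open the wall between the cells
        grid[2 * nr + 1, 2 * nc + 1] = STATE  # open the newly visited cell
        visited.add((nr, nc))
        stack.append((nr, nc))
    grid[1, 1] = START                         # top-left cell
    grid[2 * rows - 1, 2 * cols - 1] = GOAL    # bottom-right cell
    return grid


maze = dfs_maze(4, 5, seed=0)
```

Because the search visits every cell exactly once and opens one wall per newly visited cell, the result is a spanning tree: every open cell is reachable from the start along a single path.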

smartstart.environments.gridworld module

GridWorld module

class GridWorld(name, layout, T_prob=0.0, wall_reset=False, scale=5)

Bases: smartstart.environments.environment.Environment

GridWorld environment

Creates a GridWorld environment with the layout specified by the user. A layout is a double (two-dimensional) list or numpy array, where each entry is a state with a certain type. The possible types are:

  0. State
  1. Wall
  2. Start state
  3. Goal state

Example layout = [[3, 0, 0],[1, 1, 0],[1, 1, 0],[2, 0, 0]]
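
The example layout can be inspected with plain numpy. This sketch assumes the type codes are 0 (state), 1 (wall), 2 (start state) and 3 (goal state), which matches the values that appear in the example; the constant names are hypothetical.

```python
import numpy as np

STATE, WALL, START, GOAL = 0, 1, 2, 3  # assumed type codes

layout = np.array([[3, 0, 0],
                   [1, 1, 0],
                   [1, 1, 0],
                   [2, 0, 0]])

# Locate the single start and goal cells as (row, column) tuples
start = tuple(int(i) for i in np.argwhere(layout == START)[0])
goal = tuple(int(i) for i in np.argwhere(layout == GOAL)[0])
n_walls = int((layout == WALL).sum())
```

Under that assumption the start sits in the bottom-left corner, the goal in the top-left corner, and the agent has to walk around the 2x2 block of walls along the right-hand column.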

The agent can take discrete steps in 4 directions: up, right, down and left. When no transition probability is defined (T_prob=0) the environment will be deterministic (i.e. action up will bring the agent to the state above the current state). When a transition probability is defined, there is a T_prob chance that a random action from the four actions is executed instead of the chosen one.
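
The effect of T_prob can be sketched as follows. This is an illustration only: the function name effective_action and the action encoding (0 = up, 1 = right, 2 = down, 3 = left) are assumptions, not taken from the GridWorld source.

```python
import random

ACTIONS = (0, 1, 2, 3)  # assumed encoding: up, right, down, left


def effective_action(action, t_prob=0.0, rng=random):
    """Return the action that is actually executed.

    With probability t_prob a uniformly random action replaces the
    requested one (the random pick may equal the requested action).
    """
    if rng.random() < t_prob:
        return rng.choice(ACTIONS)
    return action


rng = random.Random(0)
deterministic = [effective_action(2, 0.0, rng) for _ in range(100)]
noisy = [effective_action(2, 1.0, rng) for _ in range(100)]
```

With t_prob=0.0 the requested action is always executed, so the environment is deterministic; with t_prob=1.0 every executed action is drawn at random.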

When the agent tries to step through a wall it will count as a step but the agent will stay in the same state. When wall_reset is set to True, hitting a wall additionally makes the environment return done to the learning algorithm.

A reward of 1.0 is given for reaching the goal state, and the environment will return done=True to the learning algorithm.

Five different preset gridworlds are defined: EASY, MEDIUM, HARD, EXTREME and IMPOSSIBRUUHHH. The first three have a simple layout, EXTREME is a more complicated maze, and IMPOSSIBRUUHHH generates a random maze with a specified size.

Parameters:
  • name (str) – name for the environment, class name will be prefixed to this name
  • layout (double list of int or np.ndarray) – layout of the gridworld
  • T_prob (float) – transition probability
  • wall_reset (bool) – when True, the environment will return done when a wall is hit
  • scale (int) – scale factor, gridworld will become layout * scale in size
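
One possible reading of the scale parameter is that every layout cell is expanded into a scale x scale block. The sketch below shows that reading with np.kron; it is an assumption about the behaviour, and how the start and goal cells are treated after scaling is not specified here.

```python
import numpy as np

layout = np.array([[3, 0, 0],
                   [1, 1, 0],
                   [1, 1, 0],
                   [2, 0, 0]])
scale = 5

# Assumption: each entry is repeated into a scale x scale block
scaled = np.kron(layout, np.ones((scale, scale), dtype=int))
scaled_shape = scaled.shape
```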
EASY = 0
EXTREME = 3
HARD = 2
IMPOSSIBRUUHHH = 4
MEDIUM = 1
close_render()

Close the visualizer

Returns:True if the visualizer was closed properly
Return type:bool
classmethod generate(type=0, size=None)

Generates a gridworld according to the presets

Presets are given in smartstart.environments.presets. Presets contain the name, layout and scale of the gridworld.

Parameters:
  • type (int) – gridworld type as defined in the class attributes (Default value = EASY)
  • size (tuple, optional) – used for generating a random gridworld when the type is IMPOSSIBRUUHHH.
Returns:

new gridworld environment

Return type:

GridWorld

get_T_R()

Return transition model and reward function

Creates the transition model and reward function for the gridworld and its transition probability. Can be used by dynamic programming methods, like ValueIteration.

Returns:
  • collections.defaultdict – transition model
  • collections.defaultdict – reward function
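
The exact keys of the returned dictionaries are not documented here. As a sketch under stated assumptions, a transition model and reward function for a small layout could be built as below: T maps (state, action) to a dictionary of next-state probabilities and R maps (state, action, next_state) to a reward. The names build_T_R and next_state, the key layout, the action encoding and the type codes are all hypothetical.

```python
from collections import defaultdict

import numpy as np

STATE, WALL, START, GOAL = 0, 1, 2, 3  # assumed type codes
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}  # assumed: up, right, down, left


def next_state(grid, state, action):
    """Deterministic successor; walls and the grid border block the move."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    inside = 0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]
    return (r, c) if inside and grid[r, c] != WALL else state


def build_T_R(grid, t_prob=0.0):
    T = defaultdict(lambda: defaultdict(float))
    R = defaultdict(float)
    states = [tuple(int(i) for i in s) for s in np.argwhere(grid != WALL)]
    for s in states:
        if grid[s] == GOAL:
            continue  # terminal state: no outgoing transitions
        for a in MOVES:
            for b in MOVES:
                # b is executed: always when b == a and no noise occurs,
                # and with probability t_prob / 4 through the random pick
                p = t_prob / len(MOVES) + (1.0 - t_prob if b == a else 0.0)
                if p == 0.0:
                    continue
                s2 = next_state(grid, s, b)
                T[(s, a)][s2] += p
                if grid[s2] == GOAL:
                    R[(s, a, s2)] = 1.0
    return T, R


grid = np.array([[3, 0, 0], [1, 1, 0], [1, 1, 0], [2, 0, 0]])
T_det, R_det = build_T_R(grid)
T_noisy, R_noisy = build_T_R(grid, t_prob=0.4)
```

Each T[(state, action)] is a probability distribution over next states, which is the form a value-iteration sweep needs.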
get_all_states()

Return all the states of the gridworld

Returns:states
Return type:set
get_grid()

Returns copy of the gridworld

Returns:Copy of the gridworld
Return type:np.ndarray
possible_actions(state)

Returns the available actions for state

Note

State is not used in this environment; it might be necessary for more complicated environments where states don't have the same set of actions.

Parameters:state (np.ndarray) – state
Returns:list containing the available actions for state
Return type:list of int
render(**kwargs)

Render the environment

If no visualizer is defined, the environment is printed to the console.

Parameters:**kwargs – See GridWorldVisualizer for details
Returns:True if the agent has to stop rendering
Return type:bool
reset(start_state=None)
Parameters:start_state (np.ndarray, optional) – (Default value = None)
Returns:state of the agent (start state)
Return type:np.ndarray
step(action)

Take a step in the environment

Executes action and retrieves the new state. Determines if any wall is hit and if so goes back to previous state.

Calculates the reward using the previous state, action and new state. Determines if the new state is terminal.

Parameters:action (int) – action
Returns:
  • np.ndarray – new state
  • float – reward
  • bool – True if new state is terminal
  • dict – Empty info dict (to make it equal to the step method in the OpenAI Gym)
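
The step logic described above can be sketched on a raw layout array. This is not the GridWorld implementation: the free function step, the action encoding (0 = up, 1 = right, 2 = down, 3 = left), the type codes, the treatment of the grid border as a wall and the reward of 0.0 for non-goal steps are assumptions.

```python
import numpy as np

STATE, WALL, START, GOAL = 0, 1, 2, 3  # assumed type codes
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}  # assumed: up, right, down, left


def step(grid, state, action, wall_reset=False):
    """Return (new_state, reward, done, info) for one move on grid."""
    new_state = np.asarray(state) + np.asarray(MOVES[action])
    r, c = new_state
    hit_wall = (not (0 <= r < grid.shape[0] and 0 <= c < grid.shape[1])
                or grid[r, c] == WALL)
    if hit_wall:
        new_state = np.asarray(state)  # go back; the step still counts
    at_goal = grid[tuple(new_state)] == GOAL
    reward = 1.0 if at_goal else 0.0
    done = bool(at_goal or (hit_wall and wall_reset))
    return new_state, reward, done, {}


grid = np.array([[3, 0, 0], [1, 1, 0], [1, 1, 0], [2, 0, 0]])
blocked = step(grid, (3, 0), 0)                      # up into a wall
blocked_reset = step(grid, (3, 0), 0, wall_reset=True)
reached = step(grid, (0, 1), 3)                      # left onto the goal
```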

smartstart.environments.gridworldvisualizer module

GridWorld Visualizer module

class GridWorldVisualizer(env, name='GridWorld', size=None, fps=60)

Bases: smartstart.environments.environment.Visualizer

Render the gridworld and optional extra visualizations. Available visualizers:

  0. console
  1. live agent
  2. value function
  3. density
Parameters:
  • env (GridWorld) – GridWorld environment
  • name (str) – visualizer name
  • size (tuple) – size of each window in the visualizer (Default = (450, 450))
  • fps (int) – frames per second for pygame
env

GridWorld – GridWorld environment

name

str – visualizer name

size

tuple – size of each window in the visualizer (Default = (450, 450))

fps

int – frames per second for pygame

screen

pygame.Surface – pygame surface for rendering

clock

pygame.time.Clock – pygame clock

grid

np.ndarray – GridWorld grid

colors

dict – colors for rendering the GridWorld

messages

collections.deque – deque with messages to show in the console window (maxlen=29)

active_visualizers

set – contains the visualizers to be used
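
The messages attribute is described as a deque with maxlen=29. The short sketch below shows what that bound means in practice using only the standard library; the message strings are made up for the example.

```python
from collections import deque

messages = deque(maxlen=29)  # same bound as the messages attribute
for i in range(40):
    messages.append(f"message {i}")  # the oldest entries are dropped automatically

oldest, newest = messages[0], messages[-1]
```

Once the deque is full, each append discards the oldest message, so the console window never has to hold more than 29 lines.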

add_visualizer(*args)

Add visualizer

Parameters:*args – visualizers to add, see class attributes for available visualizers
render(value_map=None, density_map=None, message=None, close=False)

Render the current state of the GridWorld

Parameters:
  • value_map (np.ndarray) – (Default value = None)
  • density_map (np.ndarray) – (Default value = None)
  • message (str) – (Default value = None)
  • close (bool) – (Default value = False)
Returns:

True if the agent has to stop rendering

Return type:

bool

smartstart.environments.presets module

Preset GridWorlds module

These presets are used by GridWorld to generate the preset gridworlds.

easy()

Easy GridWorld

Returns:
  • str – name
  • double list of int – layout
  • int – scale
extreme()

Extreme GridWorld

Returns:
  • str – name
  • double list of int – layout
  • int – scale
hard()

Hard GridWorld

Returns:
  • str – name
  • double list of int – layout
  • int – scale
medium()

Medium GridWorld

Returns:
  • str – name
  • double list of int – layout
  • int – scale

Module contents