smartstart.environments package¶

Submodules¶

smartstart.environments.environment module¶

Environment module

Contains base classes for constructing environments and visualizers

class Environment(name=None)¶

Bases: object

Base class for constructing environments

reset(start_state=None)¶

Reset the agent to the start state

Parameters:	start_state (`np.ndarray`, optional) – Start state; if not supplied, the default start state is used

step(action)¶

Take a step in the environment

Parameters:	action (`int`, `float` or `np.ndarray`) – action
Raises:	`NotImplementedError` – use a subclass of `Environment` like `GridWorld`.
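The base-class contract above can be illustrated with a minimal sketch. It does not import smartstart: `Environment` here is a stand-in written for this example, `Corridor` is a hypothetical subclass, and the four-element return value of `step` is borrowed from the `GridWorld.step` description further down. That the base `reset` also raises is an assumption; the documentation only states this for `step`.

```python
class Environment(object):
    """Stand-in for the documented base class (not the smartstart source)."""

    def __init__(self, name=None):
        self.name = name

    def reset(self, start_state=None):
        # Assumption: the documentation only states that step() raises.
        raise NotImplementedError("use a subclass of Environment")

    def step(self, action):
        raise NotImplementedError("use a subclass of Environment")


class Corridor(Environment):
    """Hypothetical 1-D environment: walk from cell 0 to cell length - 1."""

    def __init__(self, length=4):
        super().__init__(name="Corridor")
        self.length = length
        self.state = 0

    def reset(self, start_state=None):
        self.state = 0 if start_state is None else start_state
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the corridor
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}
```

Calling `step` on the stand-in base class raises `NotImplementedError`, while the subclass returns a new state, reward, done flag and info dict.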

visualizer¶

Visualizer – visualizer of the agent, value function, density and/or console.

Sets the name of the visualizer to be equal to the environment's name

class Visualizer¶

Bases: object

CONSOLE = 0¶

DENSITY = 3¶

LIVE_AGENT = 1¶

VALUE_FUNCTION = 2¶

add_visualizer(*args)¶

Adds a window to the visualizer

Available windows can be found in the implemented class attributes

Parameters:	args (`list` of `int`) – visualizers to add. Possible visualizers are defined in the class attributes
Raises:	`NotImplementedError`

render(value_map=None, density_map=None, message=None, close=False)¶

Render the current state of the training algorithm

The render method should include all functionality to close the visualizer. When the close parameter is True, the visualizer should be closed and the result sent back to the learning algorithm so that it stops rendering.

Parameters:	value_map (`np.ndarray`, optional) – value function map
	density_map (`np.ndarray`, optional) – density map
	message (`str`, optional) – message to print to the console window
	close (`bool`) – True if the visualizer has to be closed
Returns:	True if the agent has to keep rendering
Return type:	`bool`
Raises:	`NotImplementedError`

smartstart.environments.generate_gridworld module¶

Random GridWorld Generator module

Defines methods necessary for generating a random gridworld. The generated gridworld is a numpy array with dtype int where each entry is a state.

Attributes in the gridworld are defined with the following types:

0. State
1. Wall
2. Start state
3. Goal state

States will have a grid size of 3x3. States are separated by a 1x3 or 3x1 wall that can be turned into a state.

generate_gridworld(size=None)¶

Generate a random GridWorld

Generates a random GridWorld using a depth-first search (see Wikipedia).

Parameters:	size (`tuple`) – size of the GridWorld (default = (100, 100))
Returns:	`str` – name (= Random)
	`GridWorld` – new GridWorld object
	`int` – scale (= 1)
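As an illustration of depth-first maze generation, here is a simplified sketch, not the smartstart implementation. It uses single-cell states separated by single-cell walls instead of the 3x3 states described above. The function name `dfs_maze`, its parameters, the integer encoding (0 state, 1 wall, 2 start, 3 goal) and the placement of start and goal in opposite corners are all assumptions made for this example.

```python
import random

STATE, WALL, START, GOAL = 0, 1, 2, 3  # assumed encoding


def dfs_maze(rows, cols, seed=None):
    """Carve a maze with rows x cols cells using iterative depth-first search."""
    rng = random.Random(seed)
    height, width = 2 * rows + 1, 2 * cols + 1
    grid = [[WALL] * width for _ in range(height)]

    stack = [(0, 0)]
    visited = {(0, 0)}
    grid[1][1] = STATE
    while stack:
        r, c = stack[-1]
        neighbours = [
            (r + dr, c + dc)
            for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols
            and (r + dr, c + dc) not in visited
        ]
        if not neighbours:
            stack.pop()  # dead end: backtrack to the previous cell
            continue
        nr, nc = rng.choice(neighbours)
        # Cell (r, c) lives at grid position (2r + 1, 2c + 1); open the wall
        # halfway between the two cells, then the new cell itself.
        grid[r + nr + 1][c + nc + 1] = STATE
        grid[2 * nr + 1][2 * nc + 1] = STATE
        visited.add((nr, nc))
        stack.append((nr, nc))

    grid[1][1] = START
    grid[height - 2][width - 2] = GOAL
    return grid
```

Because depth-first search visits every cell exactly once and opens one wall per newly visited cell, the result is a spanning tree: every cell is reachable and there are no loops.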

smartstart.environments.gridworld module¶

GridWorld module

class GridWorld(name, layout, T_prob=0.0, wall_reset=False, scale=5)¶

Bases: smartstart.environments.environment.Environment

GridWorld environment

Creates a GridWorld environment with the layout specified by the user. A layout is a two-dimensional numpy array, where each entry is a state with a certain type. The possible types are:

0. State
1. Wall
2. Start state
3. Goal state

Example: layout = [[3, 0, 0], [1, 1, 0], [1, 1, 0], [2, 0, 0]]

The agent can take discrete steps in four directions: up, right, down and left. When no transition probability is defined (T_prob=0) the environment is deterministic (i.e. action up brings the agent to the state above the current state). When a transition probability is defined, there is a T_prob chance that the agent takes a random action from the four actions.

When the agent tries to step through a wall, it counts as a step but the agent stays in the same state. When wall_reset is set to True, the environment also returns done to the learning algorithm when a wall is hit.

A reward of 1. is given for reaching the goal state and the environment will return done=True to the learning algorithm.
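The dynamics described above can be sketched in a few lines. This is a stand-alone illustration, not the smartstart code: `grid_step` is a hypothetical helper, and the action numbering (0 up, 1 right, 2 down, 3 left), the integer encoding of walls and the goal, and treating the grid boundary like a wall are assumptions.

```python
import random

WALL, GOAL = 1, 3  # assumed encoding of the layout entries
# Assumed action numbering: up, right, down, left as (row, col) offsets.
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}


def grid_step(layout, state, action, T_prob=0.0, wall_reset=False, rng=random):
    """Return (new_state, reward, done, info) for one step on `layout`."""
    # With probability T_prob the chosen action is replaced by a random one.
    if rng.random() < T_prob:
        action = rng.choice(list(MOVES))
    d_row, d_col = MOVES[action]
    row, col = state[0] + d_row, state[1] + d_col
    inside = 0 <= row < len(layout) and 0 <= col < len(layout[0])
    if not inside or layout[row][col] == WALL:
        # Hitting a wall counts as a step, but the agent stays where it is.
        return state, 0.0, wall_reset, {}
    done = layout[row][col] == GOAL
    return (row, col), (1.0 if done else 0.0), done, {}
```

Using the example layout, stepping up from the start state at `(3, 0)` hits a wall and leaves the agent in place, while stepping left from `(0, 1)` reaches the goal and yields a reward of 1.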

Five different preset gridworlds are defined: EASY, MEDIUM, HARD, EXTREME and IMPOSSIBRUUHHH. The first three have a simple layout, EXTREME is a more complicated maze, and IMPOSSIBRUUHHH generates a random maze with a specified size.

Parameters:	name (`str`) – name for the environment, class name will be prefixed to this name
	layout (nested `list` of `int` or `np.ndarray`) – layout of the gridworld
	T_prob (`float`) – transition probability
	wall_reset (`bool`) – when True, the environment will return done when a wall is hit
	scale (`int`) – scale factor, gridworld will become layout * scale in size

EASY = 0¶

EXTREME = 3¶

HARD = 2¶

IMPOSSIBRUUHHH = 4¶

MEDIUM = 1¶

close_render()¶

Close the visualizer

Returns:	True if the visualizer was closed properly
Return type:	`bool`

classmethod generate(type=0, size=None)¶

Generates a gridworld according to the presets

Presets are given in smartstart.environments.presets. Presets contain the name, layout and scale of the gridworld.

Parameters:	type (`int`) – gridworld type as defined in the class attributes (Default value = EASY)
	size (`tuple`, optional) – used for generating a random gridworld when the type is IMPOSSIBRUUHHH.
Returns:	new gridworld environment
Return type:	`GridWorld`

get_T_R()¶

Return transition model and reward function

Creates the transition model and reward function for the gridworld and transition probability. Can be used by dynamic programming methods, like ValueIteration.

Returns:	`collections.defaultdict` – transition model
	`collections.defaultdict` – reward function
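One way such a model could be laid out is sketched below. The key structure of the real dictionaries is not documented here, so everything in this example is an assumption: `build_T_R` is a hypothetical function, `T[(state, action)][next_state]` holds a probability, `R[(state, action, next_state)]` holds a reward, and the random action is drawn uniformly from all four actions (including the chosen one).

```python
from collections import defaultdict

WALL, GOAL = 1, 3  # assumed encoding of the layout entries
MOVES = {0: (-1, 0), 1: (0, 1), 2: (1, 0), 3: (0, -1)}  # assumed numbering


def build_T_R(layout, T_prob=0.0):
    """Build a transition model and reward function for a small grid layout."""
    T = defaultdict(lambda: defaultdict(float))
    R = defaultdict(float)
    rows, cols = len(layout), len(layout[0])

    def move(state, action):
        row = state[0] + MOVES[action][0]
        col = state[1] + MOVES[action][1]
        if 0 <= row < rows and 0 <= col < cols and layout[row][col] != WALL:
            return (row, col)
        return state  # walls and the grid boundary leave the agent in place

    for row in range(rows):
        for col in range(cols):
            if layout[row][col] in (WALL, GOAL):
                continue  # no transitions out of walls or the terminal goal
            for action in MOVES:
                for executed in MOVES:
                    # Chance that `executed` is what actually happens.
                    p = T_prob / len(MOVES) + (1.0 - T_prob) * (executed == action)
                    if p == 0.0:
                        continue
                    nxt = move((row, col), executed)
                    T[((row, col), action)][nxt] += p
                    if layout[nxt[0]][nxt[1]] == GOAL:
                        R[((row, col), action, nxt)] = 1.0
    return T, R
```

A dynamic programming method can then iterate over `T[(state, action)].items()` to compute expected values, and the probabilities for each state-action pair sum to 1.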

get_all_states()¶

Return all the states of the gridworld

Returns:	states
Return type:	`set`

get_grid()¶

Returns copy of the gridworld

Returns:	Copy of the gridworld
Return type:	`np.ndarray`

possible_actions(state)¶

Returns the available actions for state

Note

State is not used in this environment, but might be necessary for more complicated environments where states don’t have the same set of actions.

Parameters:	state (`np.ndarray`) – state
Returns:	list containing the available actions for state
Return type:	`list` of `int`

render(**kwargs)¶

Render the environment

If no visualizer is defined, the environment is printed to the console.

Parameters:	**kwargs – See `GridWorldVisualizer` for details
Returns:	True if the agent has to stop rendering
Return type:	`bool`

reset(start_state=None)¶

Parameters:	start_state (`np.ndarray`, optional) – (Default value = None)
Returns:	state of the agent (start state)
Return type:	`np.ndarray`

step(action)¶

Take a step in the environment

Executes action and retrieves the new state. Determines whether a wall is hit and, if so, goes back to the previous state.

Calculates the reward using the previous state, action and new state. Determines if the new state is terminal.

Parameters:	action (`int`) – action
Returns:	`np.ndarray` – new state
	`float` – reward
	`bool` – True if new state is terminal
	`dict` – Empty info dict (to make it equal to the step method in the OpenAI Gym)
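Because `step` returns the same four values as the OpenAI Gym step method, a learning loop can be written against that interface alone. The sketch below shows such a loop. `run_episode` and `CountdownEnv` are hypothetical names introduced for this example; the stub environment only exists so that the snippet runs without smartstart installed.

```python
def run_episode(env, policy, max_steps=100):
    """Roll out one episode on any object with reset() and step(action)."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    for steps in range(1, max_steps + 1):
        state, reward, done, _info = env.step(policy(state))
        total_reward += reward
        if done:
            break  # terminal state reached
    return total_reward, steps


class CountdownEnv:
    """Hypothetical stub: terminates with reward 1.0 after three steps."""

    def reset(self, start_state=None):
        self.remaining = 3
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        done = self.remaining == 0
        return self.remaining, (1.0 if done else 0.0), done, {}
```

The `max_steps` cap is a design choice for the sketch: it keeps the loop finite when a policy never reaches a terminal state.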

smartstart.environments.gridworldvisualizer module¶

GridWorld Visualizer module

class GridWorldVisualizer(env, name='GridWorld', size=None, fps=60)¶

Bases: smartstart.environments.environment.Visualizer

Render the gridworld and optional extra visualizations. Available visualizers:

console

live agent

value function

density

Parameters:	env (`GridWorld`) – GridWorld environment
	name (`str`) – visualizer name
	size (`tuple`) – size of each window in the visualizer (Default = (450, 450))
	fps (`int`) – frames per second for pygame

env¶: GridWorld – GridWorld environment

name¶: str – visualizer name

size¶: tuple – size of each window in the visualizer (Default = (450, 450))

fps¶: int – frames per second for pygame

screen¶: pygame.Surface – pygame surface for rendering

clock¶: pygame.time.Clock – pygame clock

grid¶: np.ndarray – GridWorld grid

colors¶: dict – colors for rendering the GridWorld

messages¶: collections.deque – deque with messages to show in the console window (maxlen=29)

active_visualizers¶: set – contains the visualizers to be used
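The `maxlen=29` on the `messages` deque means the console window only ever holds the most recent 29 messages. A short sketch of that standard-library behaviour, with made-up message strings:

```python
from collections import deque

# A bounded deque silently drops the oldest entry once it is full.
messages = deque(maxlen=29)
for i in range(40):
    messages.append("message %d" % i)

oldest, newest = messages[0], messages[-1]
```

After 40 appends, `messages` contains entries 11 through 39; the first 11 have been discarded.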

add_visualizer(*args)¶

Add visualizer

Parameters:	*args – visualizers to add, see class attributes for available visualizers
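A minimal sketch of how variadic `add_visualizer` calls could populate the `active_visualizers` set. `SketchVisualizer` is a hypothetical class that only mirrors the documented constants; rejecting unknown values with `ValueError` is an assumption and not documented behaviour.

```python
class SketchVisualizer:
    """Hypothetical stand-in mirroring the documented class attributes."""

    CONSOLE = 0
    LIVE_AGENT = 1
    VALUE_FUNCTION = 2
    DENSITY = 3

    def __init__(self):
        self.active_visualizers = set()

    def add_visualizer(self, *args):
        known = {self.CONSOLE, self.LIVE_AGENT, self.VALUE_FUNCTION, self.DENSITY}
        for visualizer in args:
            if visualizer not in known:
                # Assumption: the real class may handle unknown values differently.
                raise ValueError("unknown visualizer: %r" % (visualizer,))
            self.active_visualizers.add(visualizer)


vis = SketchVisualizer()
vis.add_visualizer(SketchVisualizer.LIVE_AGENT, SketchVisualizer.CONSOLE)
vis.add_visualizer(SketchVisualizer.CONSOLE)  # adding twice has no effect
```

Storing the selection in a set makes repeated calls idempotent, which matches the documented type of `active_visualizers`.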

render(value_map=None, density_map=None, message=None, close=False)¶

Render the current state of the GridWorld

Parameters:	value_map (`np.ndarray`) – (Default value = None)
	density_map (`np.ndarray`) – (Default value = None)
	message (`str`) – (Default value = None)
	close (`bool`) – (Default value = False)
Returns:	True if the agent has to stop rendering
Return type:	`bool`

smartstart.environments.presets module¶

Preset GridWorlds module

These presets are used by GridWorld to generate the gridworld.

easy()¶

Easy GridWorld

Returns:	`str` – name
	nested `list` of `int` – layout
	`int` – scale

extreme()¶

Extreme GridWorld

Returns:	`str` – name
	nested `list` of `int` – layout
	`int` – scale

hard()¶

Hard GridWorld

Returns:	`str` – name
	nested `list` of `int` – layout
	`int` – scale

medium()¶

Medium GridWorld

Returns:	`str` – name
	nested `list` of `int` – layout
	`int` – scale