Environment (Env)

Env

The parent class for environments.

class perls2.envs.env.Env(config, use_visualizer=False, name=None)

Abstract base class for environments.
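As a quick orientation, here is a minimal sketch of a concrete subclass. The reach task, the goal position, and the robot interface's ee_position attribute are illustrative assumptions, not part of the documented API.

    import numpy as np

    from perls2.envs.env import Env

    class ReachEnv(Env):
        """Sketch of a hypothetical reaching task built on Env."""

        GOAL = np.array([0.5, 0.0, 0.2])  # hypothetical goal position (m)

        def get_observation(self):
            # Assumes the robot interface exposes an end-effector position.
            return {"ee_position": np.array(self.robot_interface.ee_position)}

        def rewardFunction(self):
            # User-defined reward: negative distance from end-effector to goal.
            ee = self.get_observation()["ee_position"]
            return -float(np.linalg.norm(ee - self.GOAL))

        def is_success(self):
            # Task condition: end-effector within 2 cm of the goal.
            ee = self.get_observation()["ee_position"]
            return bool(np.linalg.norm(ee - self.GOAL) < 0.02)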

config

A dict containing parameters to create an arena, robot interface, sensor interface and object interface. It also contains specs for learning, simulation and experiment setup.

Type

dict
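
For orientation, a config might be assembled as a plain dict before constructing the environment. The key names below are purely illustrative placeholders, not the perls2 schema; consult your project's config files for the real structure.

    # Illustrative only: these keys are hypothetical placeholders.
    config = {
        "world": {"type": "Bullet", "robot": "sawyer"},  # arena/robot setup
        "policy_freq": 20,                               # control rate (Hz)
        "max_steps": 500,                                # episode length
    }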

arena

Manages the scene by loading models in both sim and real envs and, for simulations, randomizing object and sensor parameters.

Type

Arena

robot_interface

Communicates with robots and executes robot commands.

Type

perls2.RobotInterface

sensor_interface

Retrieves sensor info and applies changes to extrinsic/intrinsic parameters.

Type

perls2.SensorInterface

object_interfaces

Dictionary of ObjectInterfaces.

Type

dict

action_space

get_observation()

Get observation of current env state.

Returns

An observation, typically a dict with multiple entries (e.g., sensor readings and robot state)
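
Given an instantiated environment (such as the ReachEnv sketch above), the returned dict is entirely task-defined:

    obs = env.get_observation()
    # e.g. {"ee_position": array([0.48, 0.01, 0.22])}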

handle_exception(e)

Handle an exception.

property info

Return a dictionary with env info.

This may include name, number of steps, whether episode was successful, or other useful information for the agent.
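
For example, mirroring the fields mentioned above (the keys shown are hypothetical):

    info = env.info
    # e.g. {"name": "reach", "num_steps": 132, "success": False}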

initialize()

Create attributes for environment.

This function creates the following attributes:
  • Arena

  • RobotInterface

  • SensorInterface

  • ObjectInterface(s) (if applicable)

  • observation_space

  • action_space

  • various counters

This function is public because it is sometimes necessary to reinitialize an environment to fully reset a simulation.

Args: None

Returns: None
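
For example, when a plain reset() is not enough to fully reset a simulation:

    env.initialize()   # re-create the arena, interfaces, spaces, and counters
    obs = env.reset()  # then reset as usual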

is_done()

Public wrapper to check episode termination.

is_success()

Check if the task condition is reached.

render(mode='human', close=False)

Render the gym environment.

See the OpenAI Gym reference.

Parameters
  • mode (str) – string indicating type of rendering mode.

  • close (bool) – whether to close the rendering window.

reset()

Reset the environment.

Returns

The initial observation.

rewardFunction()

Compute and return user-defined reward for agent given env state.

step(action, start=None)

Take a step.

Execute the action first, then step the world. Update the episode and step counters, determine the termination state, compute the reward, and get the observation.

Parameters
  • action – The action to take.

  • start – timestamp (time.time()) taken before the policy computes the action. This helps enforce the policy frequency.

Returns

A tuple of (observation, reward, termination, info) for the environment.
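
Putting reset(), step(), and the start timestamp together, a typical rollout might look like this; policy is a hypothetical callable and ReachEnv is the sketch from the top of this page:

    import time

    env = ReachEnv(config, use_visualizer=False, name="reach")
    obs = env.reset()
    done = False
    while not done:
        start = time.time()   # timestamp taken before the policy runs
        action = policy(obs)  # hypothetical policy mapping obs -> action
        obs, reward, done, info = env.step(action, start=start)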

visualize(observation, action)

Visualize the action: in simulation, add visual markers to the world; on a real robot, execute small indicative movements to signal the action about to be performed.

Parameters
  • observation – The observation of the current step.

  • action – The selected action.
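
In a rollout loop, visualize() would typically be called just before step() (hypothetical usage):

    action = policy(obs)        # hypothetical policy
    env.visualize(obs, action)  # preview the action before executing it
    obs, reward, done, info = env.step(action)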