Environment (Env)¶
Env¶
The parent class for environments.
class perls2.envs.env.Env(config, use_visualizer=False, name=None)¶
Abstract base class for environments.
config¶
A dict containing parameters to create an arena, robot interface, sensor interface, and object interface, as well as specs for learning, simulation, and experiment setup.
Type: dict
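As a sketch, such a config might hold a nested sub-dict for each component. All key names below are hypothetical, chosen only for illustration; the actual keys are defined by perls2's YAML config files.

```python
# Hypothetical config dict for illustration only; the real keys come
# from the perls2 YAML configs, not from this example.
config = {
    "world": {"type": "Bullet"},          # arena / simulation setup (hypothetical key)
    "robot": {"type": "sawyer"},          # robot interface params (hypothetical key)
    "sensor": {"camera": "sim_camera"},   # sensor interface params (hypothetical key)
    "policy_freq": 20,                    # learning / experiment spec (hypothetical key)
}

# The dict is then passed to the Env constructor:
# env = Env(config, use_visualizer=False, name="example_env")
```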
arena¶
Manages the sim by loading models in both sim and real envs; for simulations, it also randomizes object and sensor parameters.
Type: Arena
robot_interface¶
Communicates with robots and executes robot commands.
Type: perls2.RobotInterface
sensor_interface¶
Retrieves sensor info and applies changes to extrinsic/intrinsic parameters.
Type: perls2.SensorInterface
object_interfaces¶
Dictionary of ObjectInterfaces.
Type: dict
action_space¶
get_observation()¶
Get an observation of the current env state.
Returns: An observation, typically a dict with multiple entries.
handle_exception(e)¶
Handle an exception.
property info¶
Return a dictionary with env info.
This may include the name, the number of steps, whether the episode was successful, or other information useful to the agent.
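The description above suggests a dict shaped roughly like the following; the key names here are hypothetical, and the actual contents depend on the Env subclass.

```python
# Sketch of what the info property might return; every key name is
# hypothetical - the real contents are chosen by the Env subclass.
info = {
    "name": "example_env",   # environment name
    "num_steps": 42,         # steps taken so far in the episode
    "success": False,        # whether the episode was successful
}
```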
initialize()¶
Create attributes for the environment.
This function creates the following attributes:
- Arena
- RobotInterface
- SensorInterface
- ObjectInterface(s) (if applicable)
- observation_space
- action_space
- various counters
This is a public function because it is sometimes necessary to reinitialize an environment to fully reset a simulation.
Args: None
Returns: None
is_done()¶
Public wrapper to check episode termination.
is_success()¶
Check if the task condition is reached.
render(mode='human', close=False)¶
Render the gym environment. See the OpenAI Gym reference.
Parameters:
- mode (str) – string indicating the type of rendering mode.
- close (bool) – whether to close the rendering window.
reset()¶
Reset the environment.
Returns: The observation.
rewardFunction()¶
Compute and return the user-defined reward for the agent given the env state.
step(action, start=None)¶
Take a step.
Execute the action first, then step the world. Update the episode/step counters, determine the termination state, compute the reward function, and get the observation.
Parameters:
- action – The action to take.
- start – timestamp (time.time()) taken before the policy computes the action. This helps enforce the policy frequency.
Returns: Observation, reward, termination, and info for the environment, as a tuple.
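The reset/step/is_done cycle above can be sketched with a minimal, self-contained stand-in that mimics this interface. It does not use perls2 at all; a real Env subclass would also wire up the arena, robot_interface, and sensor_interface.

```python
import time

class CountingEnv:
    """Toy stand-in for the Env interface: episode ends after max_steps."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.num_steps = 0

    def reset(self):
        # Reset the environment and return the observation.
        self.num_steps = 0
        return self.get_observation()

    def get_observation(self):
        # Observations are typically a dict with multiple entries.
        return {"num_steps": self.num_steps}

    def rewardFunction(self):
        # User-defined reward given the env state.
        return 1.0 if self.num_steps >= self.max_steps else 0.0

    def is_done(self):
        return self.num_steps >= self.max_steps

    def step(self, action, start=None):
        # Execute the action and step the world, then update counters,
        # compute the reward, check termination, and get the observation.
        self.num_steps += 1
        obs = self.get_observation()
        reward = self.rewardFunction()
        done = self.is_done()
        info = {"num_steps": self.num_steps}
        return obs, reward, done, info

# A typical rollout loop against this interface:
env = CountingEnv(max_steps=3)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    start = time.time()  # timestamp passed to step() to help enforce policy frequency
    action = None        # a real policy would compute an action from obs here
    obs, reward, done, info = env.step(action, start=start)
    total_reward += reward
print(total_reward)  # 1.0: reward is granted on the final step
```

The loop illustrates the documented contract only: step() returns (observation, reward, termination, info) as a tuple, and the optional start timestamp is taken before the policy computes the action.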
visualize(observation, action)¶
Visualize the action; that is, add visual markers to the world (in case of sim) or execute some movements (in case of real) to indicate the action about to be performed.
Parameters:
- observation – The observation of the current step.
- action – The selected action.