The BEHAVIOR Dataset of Human Demonstrations in Virtual Reality
The BEHAVIOR dataset includes 500 demonstrations of human participants performing BEHAVIOR-100 activities in iGibson 2.0 using a virtual reality interface. The data was collected from 5 participants using an HTC Vive Pro Eye headset with 2 HTC Vive Wand controllers and 1 torso tracker strapped to the participant's waist. Participants were cued by a VR overlay describing the activity's goal conditions. Activity-relevant objects corresponding to a goal condition were highlighted in the user's view on demand, guiding human execution, and the text of the corresponding part of the condition switched from red to green when that part was completed.
There are two available HDF5 datasets. The first is processed for imitation learning, containing rendered sensor modalities that match the agent's observations, as well as the recorded actions. The second contains the "raw" unprocessed observations, which include additional data from the VR recordings and the simulator state of the activity-relevant objects.
Note: Since every small change in the simulator may cause execution to deviate, replaying the demos (running the actions stored in a demonstration file and reproducing the exact same outcomes, e.g. states/images) comes with some caveats. You must:
- Be on Windows.
- Use the behavior-replay branch of iGibson.
- Clone the behavior-replay branch of bddl, uninstall any existing version of bddl with pip uninstall bddl, then install the cloned version by running pip install -e . in the bddl directory.
- Download the behavior-replay version of the iGibson assets via this link, put it under iGibson/igibson/data, and rename it assets.
- Download the behavior-replay version of the BEHAVIOR dataset via this link, put it under iGibson/igibson/data, and rename it ig_dataset.
Following the instructions above, we have verified that the 500 demos can be replayed and reproduced perfectly.
To replay a demo, try the following command:
python -m igibson.examples.behavior.behavior_demo_replay --mode headless --vr_log_path xxx.hdf5
Virtual reality demonstration dataset (VR_demos)
We provide a processed dataset generated from the raw files that includes images and proprioceptive sensing. Alternative processed datasets can be generated from the raw files, e.g. with different image resolutions.
HDF5 Content
The following are the available keys to index into the hdf5 file. The dimensionality of each component is noted parenthetically, where N indicates the number of frames in the demo. A short loading sketch follows the list.
action (N x 28) – see the BehaviorRobot description in the Embodiments section for details about the actuation of this robot. This vector contains two additional dimensions that correspond to the hand reset action in VR: an action that teleports the simulated hands to the exact pose of the VR hand controllers when they have diverged. These actions are not used by AI agents but are necessary to understand the demos.
proprioception (N x 22) – proprioceptive feedback. More details in the Embodiments section.
rgb (N x 128 x 128 x 3) – RGB image from the agent's camera
depth (N x 128 x 128 x 1) – depth map
seg (N x 128 x 128 x 1) – segmentation of the scene
ins_seg (N x 128 x 128 x 1) – instance segmentation
highlight (N x 128 x 128 x 1) – binary mask of activity-relevant objects, active for all objects included in the activity goal (except the agent and the floor)
task_obs (N x 456) – task observations, including the ground-truth state of the robot, and the ground-truth poses and grasping states of up to a fixed maximum number of activity-relevant objects
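As an illustration, here is a minimal loading sketch. It assumes the keys above are stored as top-level datasets in the processed file; the file path is a placeholder:

import h5py
import numpy as np

with h5py.File('/path/to/processed_demo.hdf5', 'r') as hf:  # placeholder path
    actions = np.array(hf['action'])          # (N, 28)
    rgb = np.array(hf['rgb'])                 # (N, 128, 128, 3)
    proprio = np.array(hf['proprioception'])  # (N, 22)

# e.g., assemble (observation, action) pairs for behavioral cloning
pairs = [((rgb[t], proprio[t]), actions[t]) for t in range(actions.shape[0])]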
Raw virtual reality demonstration dataset (VR_demos_raw)
Accessing the BEHAVIOR hdf5 demo files
This code snippet provides access to the data using the h5py module, which is a dependency of iGibson:
import h5py
import numpy as np

frame_number = 0
hf = h5py.File('/path/to/behavior/demo.hdf5', 'r')

# see below for metadata description
activity_name = hf.attrs['metadata/task_name']
print(activity_name)

# see below for finding indices for each array
position_of_hmd = np.array(hf['frame_data']['vr_device_data']['hmd'][frame_number, 1:4])
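Before indexing into a demo file, it can help to print its full key hierarchy; h5py's visit traversal walks every group and dataset in the file:

# print every group and dataset name in the file
hf.visit(print)

# list the available metadata attribute keys
for key in hf.attrs:
    print(key)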
Metadata
The metadata can be accessed by keying into the hdf5 file's attrs with the following keys; a short reading sketch follows the list:
- /metadata/start_time: the date the demo was recorded
- /metadata/physics_timestep: the simulated time duration of each step of the physics simulator (1/300 seconds for all our demos)
- /metadata/render_timestep: the simulated time between each rendered image, which determines the framerate (1/30 seconds). render_timestep / physics_timestep gives the number of physics simulation steps between two generated images (10)
- /metadata/git_info: the git info for the activity definition, iGibson, ig_dataset, and ig_assets. This is used to ensure a compatible version of iGibson is used when replaying the demo
- /metadata/task_name: the name of the activity, e.g. washing_dishes, putting_away_groceries, ...
- /metadata/task_instance: the instance of the activity that specifies the state (pose, extended state) of the sampled activity-relevant objects at initialization
- /metadata/scene_id: the scene (Rs_int, Wainscott_0_int, etc.) where the activity was recorded
- /metadata/filter_objects: whether only activity-relevant objects were recorded in the activity
- /metadata/obj_body_id_to_name: mapping from pybullet body IDs to semantic object names
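For example, continuing from the snippet above, a few of these attributes can be read directly (assuming the timestep attributes are stored as floats):

scene_id = hf.attrs['metadata/scene_id']
physics_timestep = hf.attrs['metadata/physics_timestep']
render_timestep = hf.attrs['metadata/render_timestep']

# number of physics simulation steps between two rendered frames (10 for our demos)
steps_per_frame = int(round(render_timestep / physics_timestep))
print(scene_id, steps_per_frame)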
HDF5 Content
The following are the available keys to index into the hdf5 file. The dimensionality of each component is noted parenthetically, where N indicates the number of frames in the demo. A short reading sketch follows the key tree.
- frame_data (N x 4)
- goal_status
  - satisfied (N x total_goals) – satisfied top-level predicates, where total_goals is the number of predicates
  - unsatisfied (N x total_goals) – unsatisfied top-level predicates, where total_goals is the number of predicates
- physics_data
  - string (the object's bullet_id; one group per activity-relevant scene object)
    - position (N x 3) – the 3D position of the object's center of mass
    - orientation (N x 4) – the quaternion orientation of the object
    - joint_state (N x number of object joints) – the pybullet joint state of each object joint
- vr
  - vr_camera
    - right_eye_view (N x 4 x 4) – the view projection matrix
    - right_eye_proj (N x 4 x 4) – the camera projection matrix
    - right_camera_pos (N x 3) – the 3D position of the camera
  - vr_device_data
    - hmd (N x 17) – see below
    - left_controller (N x 27) – see below
    - right_controller (N x 27) – see below
    - vr_position_data (N x 12) – see below
    - torso_tracker (N x 8) – see below
  - vr_button_data
    - left_controller (N x 3) – see below
    - right_controller (N x 3) – see below
  - vr_eye_tracking_data
    - left_controller (N x 9) – see below
  - vr_event_data
    - left_controller (N x 28) – see below
    - right_controller (N x 28) – see below
    - reset_actions (N x 2) – reset actions for the left and right controllers
- agent_actions
  - vr_robot (N x 28) – see the BehaviorRobot description in the previous section
- action – unused
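As an example, continuing from the snippet above, goal progress can be tracked over the course of a demo. This is a sketch that assumes the satisfied/unsatisfied entries are per-predicate indicators:

satisfied = np.array(hf['goal_status']['satisfied'])      # (N, total_goals)
unsatisfied = np.array(hf['goal_status']['unsatisfied'])  # (N, total_goals)

# fraction of top-level predicates satisfied at each frame
fraction_satisfied = satisfied.sum(axis=1) / (satisfied.sum(axis=1) + unsatisfied.sum(axis=1))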
Additional description of the dimensions of the arrays noted above. The following are not keys but correspond to indices of the associated array; a slicing sketch follows the list:
hmd (17)
- hmd tracking data is valid (1)
- translation (3)
- rotation (4)
- right vector (3)
- up vector (3)
- forward vector (3)
left_controller / right_controller (27)
- controller tracking data is valid (1)
- translation (3)
- rotation (4)
- right vector (3)
- up vector (3)
- forward vector (3)
- base_rotation (4)
- base_rotation * controller_rotation (4)
- applied_force (6) – p.getConstraintState(controller_constraint_id)
vr_button_data (3)
- trigger fraction (1) – open: 0 -> closed: 1
- touchpad x position (1) – left: -1 -> right: 1
- touchpad y position (1) – bottom: -1 -> top: 1
vr_eye_tracking_data (9)
- eye-tracking data is valid (1)
- origin of gaze in world space (3)
- direction vector of gaze in world space (3)
- left pupil diameter (1)
- right pupil diameter (1)
vr_position_data (12)
- position of the system in iGibson space (3)
- offset of the system from the origin (3)
- applied force to the vr body (6) – p.getConstraintState(body_constraint_id)
torso_tracker (8)
- torso tracker data is valid (1)
- position (3)
- rotation (4)
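For instance, continuing from the snippet above, the hmd array can be unpacked by slicing the index ranges listed here; the same pattern applies to the other arrays:

hmd = np.array(hf['frame_data']['vr_device_data']['hmd'])  # (N, 17)

is_valid = hmd[:, 0]         # hmd tracking data is valid (1)
translation = hmd[:, 1:4]    # translation (3)
rotation = hmd[:, 4:8]       # rotation quaternion (4)
right_vec = hmd[:, 8:11]     # right vector (3)
up_vec = hmd[:, 11:14]       # up vector (3)
forward_vec = hmd[:, 14:17]  # forward vector (3)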
Add your own demos!
Coming soon!