gym-ignition

The previous sections described how to use ScenarIO to perform generic rigid-body simulations. The gym_ignition Python package provides boilerplate code that uses ScenarIO to create reinforcement learning environments compatible with OpenAI Gym.

Beyond the abstractions provided by ScenarIO, gym-ignition introduces the following:

Runtime: Base class to abstract the runtime of an environment. It provides the code that steps the simulator for simulated environments or deals with real-time execution for environments running on real robots. The implementation for Ignition Gazebo is GazeboRuntime.
Task: Base class providing the structure of the decision-making logic. The code of the task must be independent of the runtime, and it should use only the ScenarIO APIs. The active runtime then executes the task in either a simulated or a real world.
gym_ignition.randomizers: Utilities to develop gym.Wrapper classes that randomize the environment at every rollout. The implementation for Ignition Gazebo is GazeboEnvRandomizer.
gym_ignition.rbd: Utilities commonly used in robotic environments, like inverse kinematics and rigid-body dynamics algorithms. Refer to InverseKinematicsNLP and KinDynComputations for more details.
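
The separation between Task and Runtime described above can be illustrated with a minimal, standard-library-only sketch. It does not use the actual gym_ignition classes: ToyWorld, ReachTarget, ToySimRuntime, and all method names are hypothetical stand-ins chosen for illustration, and the real base classes may expose a different interface. The point is only that the task reads and writes the world, while the runtime decides how the world advances.

```python
import abc


class Task(abc.ABC):
    """Hypothetical, simplified task: decision-making logic only."""

    @abc.abstractmethod
    def reset_task(self) -> None: ...

    @abc.abstractmethod
    def set_action(self, action: int) -> None: ...

    @abc.abstractmethod
    def get_observation(self) -> float: ...

    @abc.abstractmethod
    def get_reward(self) -> float: ...

    @abc.abstractmethod
    def is_done(self) -> bool: ...


class Runtime(abc.ABC):
    """Hypothetical, simplified runtime: owns the stepping logic."""

    def __init__(self, task: Task) -> None:
        self.task = task

    @abc.abstractmethod
    def advance(self) -> None:
        """Advance the world (simulated or real) by one control step."""

    def reset(self) -> float:
        self.task.reset_task()
        return self.task.get_observation()

    def step(self, action: int):
        # The task never steps the world itself; only the runtime does.
        self.task.set_action(action)
        self.advance()
        return (
            self.task.get_observation(),
            self.task.get_reward(),
            self.task.is_done(),
            {},
        )


class ToyWorld:
    """Stand-in for the world abstraction: a point moving on a line."""

    def __init__(self) -> None:
        self.position = 0.0
        self.velocity = 0.0


class ReachTarget(Task):
    """Toy task: move the point until it reaches the target position."""

    def __init__(self, world: ToyWorld, target: float = 1.0) -> None:
        self.world = world
        self.target = target

    def reset_task(self) -> None:
        self.world.position = 0.0
        self.world.velocity = 0.0

    def set_action(self, action: int) -> None:
        # Action 1 moves forward, any other action moves backward.
        self.world.velocity = 1.0 if action == 1 else -1.0

    def get_observation(self) -> float:
        return self.world.position

    def get_reward(self) -> float:
        return -abs(self.target - self.world.position)

    def is_done(self) -> bool:
        return self.world.position >= self.target


class ToySimRuntime(Runtime):
    """Toy simulated runtime: integrates the world with a fixed time step."""

    def __init__(self, task: Task, world: ToyWorld, dt: float = 0.25) -> None:
        super().__init__(task)
        self.world = world
        self.dt = dt

    def advance(self) -> None:
        self.world.position += self.world.velocity * self.dt
```

In this sketch, ReachTarget contains no stepping code, so a different Runtime subclass (for example one that waits for a real control period instead of integrating) could execute the same task unchanged.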

Tip

If you want to learn more about iDynTree, the two classes we mainly use are iDynTree::KinDynComputations (docs) and iDynTree::InverseKinematics (docs).

The theory and notation are summarized in Multibody dynamics notation.

You can find demo environments created with gym-ignition in the gym_ignition_environments folder. These examples show how to structure a new standalone Python package containing the environment with your robots.

For example, taking the cartpole balancing problem with discrete actions, the components you need to implement are the following:

A model CartPole (model/cartpole.py)
A task CartPoleDiscreteBalancing (cartpole_discrete_balancing.py)
A randomizer CartpoleEnvRandomizer (randomizers/cartpole.py)
Environment registration as done in __init__.py
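
To give an idea of the role played by the randomizer component in the list above, here is a minimal sketch of a wrapper that changes a physical parameter at the beginning of every rollout. It is not the code in randomizers/cartpole.py: ToyCartPoleEnv, ToyEnvRandomizer, the pole_mass attribute, and the mass range are all hypothetical, and the sketch deliberately avoids subclassing gym.Wrapper so that it runs with the standard library only.

```python
import random


class ToyCartPoleEnv:
    """Stand-in environment exposing only what this sketch needs."""

    def __init__(self) -> None:
        self.pole_mass = 0.1  # hypothetical physical parameter [kg]

    def reset(self):
        # Return a dummy initial observation.
        return [0.0, 0.0, 0.0, 0.0]


class ToyEnvRandomizer:
    """Wraps an environment and randomizes it at the start of every rollout."""

    def __init__(self, env, seed=None, mass_range=(0.05, 0.5)) -> None:
        self.env = env
        self.rng = random.Random(seed)
        self.mass_range = mass_range

    def randomize(self) -> None:
        # Sample a new pole mass uniformly within the configured range.
        low, high = self.mass_range
        self.env.pole_mass = self.rng.uniform(low, high)

    def reset(self):
        # A rollout starts with reset(), so this is where randomization happens.
        self.randomize()
        return self.env.reset()
```

Each call to reset() on the wrapper therefore starts a rollout with a freshly sampled pole mass, while the wrapped environment stays unaware of the randomization.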

With all these resources in place, you can run a random policy on the environment as shown in launch_cartpole.py.
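
The general shape of a random-policy loop can be sketched as follows. This is not the content of launch_cartpole.py: ToyDiscreteEnv is a hypothetical stand-in that only mimics the classic Gym interface (reset() returning an observation, step() returning observation, reward, done, and info), and the n_actions attribute replaces the action space object that a real Gym environment would provide.

```python
import random


class ToyDiscreteEnv:
    """Stand-in environment whose episodes end after a fixed number of steps."""

    def __init__(self, horizon: int = 10) -> None:
        self.horizon = horizon
        self.n_actions = 2  # hypothetical: e.g. push the cart left or right
        self._t = 0

    def reset(self) -> int:
        self._t = 0
        return self._t

    def step(self, action: int):
        self._t += 1
        reward = 1.0  # constant reward for every step survived
        done = self._t >= self.horizon
        return self._t, reward, done, {}


def run_random_policy(env, episodes: int, seed=None):
    """Run a uniformly random policy and return the total reward per episode."""
    rng = random.Random(seed)
    returns = []
    for _ in range(episodes):
        env.reset()
        done = False
        total_reward = 0.0
        while not done:
            # Random policy: pick any valid discrete action.
            action = rng.randrange(env.n_actions)
            _, reward, done, _ = env.step(action)
            total_reward += reward
        returns.append(total_reward)
    return returns
```

With a real environment, the same loop would typically sample actions from the environment's action space instead of using a separate random number generator.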