We used MuJoCo, a physics simulation engine, to train a model of the Hexapod. MuJoCo supports custom models: you define a model environment and an XML file that describes the model's physical structure.

MuJoCo models are described in MuJoCo's own XML format (MJCF). Each model file starts with a set of compiler settings, followed by the definition of the model itself:

<worldbody>
    <light cutoff="100" diffuse="1 1 1" dir="-0 0 -1.3" directional="true" exponent="1" pos="0 0 1.3" specular="0.1 0.1 0.1"/>

    <body name="torso" pos="0 0 0.75">
       <camera name="track" mode="trackcom" pos="0 -3 -1.3" xyaxes="1 0 0 0 0 1"/>
       <geom name="torso_geom" pos="0 0 0" size="1.15 0.2" type="cylinder" rgba="1 1 1 1" mass="17"/>
       <joint armature="0" damping="0" limited="false" margin="0.01" name="root" pos="0 0 0" type="free"/>
       <!-- a site is a lightweight, geom-like marker; here it is the attachment point for the sensors -->
       <site name="gyrosite" size="0.1 0.1 0.2" type="box" rgba="0.1 0.1 0.1 1"/>
       <body name="position_ref_point" pos="0 0 -0.2">
           <geom name="position_ref_point_geom" size="0.01 0.01 0.01" type="box" rgba="1 0.0 0.0 1"/>
       </body>
  
        <!-- LEG1 color: red -->
        <body name="leg1">
            <body name="coxa1" pos="1.2 0.0 0.0" euler="0 0 0">
                <joint axis="0 0 1" name="coxa_1" pos="0 0.0 0.0" range="-45 45" type="hinge"/>
                <geom name="coxa_1_geom" size="0.25 0.25 0.2" type="box" rgba="1 0.0 0.0 1"/>
                <geom name="leg1_bolt1" size="0.05 0.25" type="cylinder"/>
                <!-- ... and so on .. -->
    <!-- ... Leg 2 to 6 ... -->
    </body>
</worldbody>

After the world body, the actuators (i.e. the motors) are defined:

<actuator>
    <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="coxa_1" gear="100"/>
    <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="femur_1" gear="100"/>
    <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="tibia_1" gear="100"/>
    
    <!-- ... Motors for leg 2 to 6 ...-->
</actuator>

Finally, the sensors are defined:

<sensor>
    <gyro name="gyrosensor" site="gyrosite"/>
    <accelerometer name="accelerometer" site="gyrosite"/>
</sensor>
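
Because actuators and sensors refer to joints and sites by name, a misspelled name is an easy mistake to make. The following is a minimal sketch, using only Python's standard library, of how these cross-references could be checked before the model is loaded. The trimmed-down MJCF string and the helper name `find_dangling_references` are illustrative assumptions and not part of the project.

```python
import xml.etree.ElementTree as ET

# Trimmed-down MJCF fragment based on the snippets above (illustrative only).
MJCF = """
<mujoco>
  <worldbody>
    <body name="torso" pos="0 0 0.75">
      <joint name="root" type="free"/>
      <site name="gyrosite" size="0.1 0.1 0.2" type="box"/>
      <body name="coxa1" pos="1.2 0 0">
        <joint name="coxa_1" axis="0 0 1" range="-45 45" type="hinge"/>
        <geom name="coxa_1_geom" size="0.25 0.25 0.2" type="box"/>
      </body>
    </body>
  </worldbody>
  <actuator>
    <motor joint="coxa_1" gear="100"/>
    <motor joint="femur_1" gear="100"/>
  </actuator>
  <sensor>
    <gyro name="gyrosensor" site="gyrosite"/>
    <accelerometer name="accelerometer" site="gyrosite"/>
  </sensor>
</mujoco>
"""


def find_dangling_references(mjcf_text):
    """Return (kind, name) pairs that are referenced but never defined."""
    root = ET.fromstring(mjcf_text)
    joints = {j.get("name") for j in root.iter("joint")}
    sites = {s.get("name") for s in root.iter("site")}
    missing = []
    # Every motor must point at a joint that exists somewhere in the model.
    for motor in root.iterfind("actuator/motor"):
        if motor.get("joint") not in joints:
            missing.append(("joint", motor.get("joint")))
    # Sensors that are attached to a site must point at an existing site.
    for sensor in root.iterfind("sensor/*"):
        site = sensor.get("site")
        if site is not None and site not in sites:
            missing.append(("site", site))
    return missing


# femur_1 is deliberately left undefined in the fragment above.
print(find_dangling_references(MJCF))
```

In this fragment the check reports the `femur_1` motor, because the fragment defines no joint of that name.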

The resulting model looks like this:

 

To use the MuJoCo model for reinforcement learning, it must be wrapped in a Python (Gym) environment.
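
As a rough illustration of what such an environment has to provide, the sketch below shows only the Gym-style `reset`/`step` interface, with the physics stubbed out. The class name `HexapodEnvSketch`, the observation size, and the episode length are hypothetical; the actual environment would run the MuJoCo simulation and compute a real reward in these methods.

```python
class HexapodEnvSketch:
    """Gym-style interface only; the physics is stubbed out (hypothetical)."""

    N_ACTIONS = 18  # 6 legs x 3 motors (coxa, femur, tibia)
    N_OBS = 6       # assumption: 3 gyro + 3 accelerometer readings

    def __init__(self, max_steps=500):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        return self._get_obs()

    def step(self, action):
        """Apply one action and return (obs, reward, done, info)."""
        if len(action) != self.N_ACTIONS:
            raise ValueError("expected %d action values" % self.N_ACTIONS)
        self.steps += 1
        # A real environment would advance the MuJoCo simulation here and
        # compute a task-specific reward; both are placeholders.
        reward = 0.0
        done = self.steps >= self.max_steps
        return self._get_obs(), reward, done, {}

    def _get_obs(self):
        # Placeholder: a real environment would read the sensor data here.
        return [0.0] * self.N_OBS


env = HexapodEnvSketch(max_steps=2)
obs = env.reset()
obs, reward, done, info = env.step([0.0] * 18)
print(len(obs), reward, done)
```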

The environment and therefore the model can be used like this:

import time

import gym
import numpy as np

import HexapodEnvironment  # imported for its side effect; presumably registers 'Hexapod-v0'

env = gym.make('Hexapod-v0')
obs = env.reset()
obs_dim = env.observation_space.shape[0]
act_dim = env.action_space.shape[0]

# Constant action: one value per motor (6 legs x 3 joints = 18 entries).
action = np.array([0.0, 0.0, 1.0] * 6, dtype=np.float32)

for i in range(500):
    env.render()
    obs, reward, done, info = env.step(action)
    time.sleep(.2)
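
The 18 action values are easier to read when they are built per leg. The helper below is a small sketch of that idea. It assumes that the motors for legs 2 to 6 are defined in the same coxa, femur, tibia order as those for leg 1; the actuator listing elides them, so this ordering is not confirmed. The name `make_action` is hypothetical.

```python
JOINTS = ("coxa", "femur", "tibia")  # order of the motors for leg 1
N_LEGS = 6


def make_action(targets_per_leg):
    """Flatten per-leg settings into one 18-element action list.

    targets_per_leg: one dict per leg mapping joint type to control value;
    joint types that are left out default to 0.0.
    """
    if len(targets_per_leg) != N_LEGS:
        raise ValueError("expected settings for %d legs" % N_LEGS)
    action = []
    for leg in targets_per_leg:
        # Assumed layout: [coxa, femur, tibia] for leg 1, then leg 2, ...
        action.extend(float(leg.get(joint, 0.0)) for joint in JOINTS)
    return action


# Reproduces the constant action from the loop above: tibia motors at 1.0.
action = make_action([{"tibia": 1.0}] * N_LEGS)
print(action)
```

The resulting list can be converted with `np.array(action, dtype=np.float32)` before it is passed to `env.step`.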



Problems with MuJoCo

  • No collision detection
  • Poor documentation: there are no tutorials, and the examples do not cover all elements (e.g. sensors).