We used MuJoCo, a physics simulation engine, to train a model of the Hexapod. MuJoCo supports custom models, which are created by defining a model environment and an XML file that describes the model's physical structure.
MuJoCo models are described in MJCF, MuJoCo's native XML format. Each model file starts with a set of compiler settings, followed by the definition of the model itself:
<worldbody>
  <light cutoff="100" diffuse="1 1 1" dir="-0 0 -1.3" directional="true" exponent="1" pos="0 0 1.3" specular="0.1 0.1 0.1"/>
  <body name="torso" pos="0 0 0.75">
    <camera name="track" mode="trackcom" pos="0 -3 -1.3" xyaxes="1 0 0 0 0 1"/>
    <geom name="torso_geom" pos="0 0 0" size="1.15 0.2" type="cylinder" rgba="1 1 1 1" mass="17"/>
    <joint armature="0" damping="0" limited="false" margin="0.01" name="root" pos="0 0 0" type="free"/>
    <!-- a site is a lightweight, geom-like marker; here it serves as the mounting point for the sensors -->
    <site name="gyrosite" size="0.1 0.1 0.2" type="box" rgba="0.1 0.1 0.1 1"/>
    <body name="position_ref_point" pos="0 0 -0.2">
      <geom name="position_ref_point_geom" size="0.01 0.01 0.01" type="box" rgba="1 0.0 0.0 1"/>
    </body>
    <!-- LEG1 color: red -->
    <body name="leg1">
      <body name="coxa1" pos="1.2 0.0 0.0" euler="0 0 0">
        <joint axis="0 0 1" name="coxa_1" pos="0 0.0 0.0" range="-45 45" type="hinge"/>
        <geom name="coxa_1_geom" size="0.25 0.25 0.2" type="box" rgba="1 0.0 0.0 1"/>
        <geom name="leg1_bolt1" size="0.05 0.25" type="cylinder"/>
        <!-- ... and so on .. -->
    <!-- ... Leg 2 to 6 ...-->
  </body>
</worldbody>
After the world body, the actuators (i.e. the motors) are defined:
<actuator>
  <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="coxa_1" gear="100"/>
  <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="femur_1" gear="100"/>
  <motor ctrllimited="false" ctrlrange="-1.0 1.0" joint="tibia_1" gear="100"/>
  <!-- ... Motors for leg 2 to 6 ...-->
</actuator>
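Because the motors follow a regular pattern, the actuator block can also be generated instead of written by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes that legs 2 to 6 use the same `coxa_N`, `femur_N`, `tibia_N` joint names as leg 1, which the excerpt above does not show; `build_actuators` is a hypothetical helper, not part of the project.

```python
import xml.etree.ElementTree as ET

# Joint name prefixes per leg, taken from the leg 1 motors above.
SEGMENTS = ("coxa", "femur", "tibia")


def build_actuators(n_legs=6, gear="100"):
    """Build an <actuator> element with one motor per joint.

    Assumption: every leg N has joints named coxa_N, femur_N and tibia_N.
    """
    actuator = ET.Element("actuator")
    for leg in range(1, n_legs + 1):
        for segment in SEGMENTS:
            ET.SubElement(
                actuator,
                "motor",
                {
                    "ctrllimited": "false",
                    "ctrlrange": "-1.0 1.0",
                    "joint": f"{segment}_{leg}",
                    "gear": gear,
                },
            )
    return actuator


actuator = build_actuators()
# Print one motor element per line so the result can be pasted into the model file.
for motor in actuator:
    print(ET.tostring(motor, encoding="unicode"))
```

Generating the block this way keeps the motor order (coxa, femur, tibia for each leg) consistent across all six legs.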
Finally, the sensors are defined:
<sensor>
  <gyro name="gyrosensor" site="gyrosite"/>
  <accelerometer name="accelerometer" site="gyrosite"/>
</sensor>
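Motors refer to joints by name and sensors refer to sites by name, so a typo in either only shows up when the model is loaded. As a rough illustration, the sketch below checks these references with Python's standard `xml.etree.ElementTree` before the file is handed to MuJoCo. The embedded MJCF fragment is a shortened, hypothetical stand-in for the real model file, and `find_dangling_references` is an illustrative helper. A strict XML parser also rejects malformed markup such as unquoted attribute values.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the model file; femur_1 is deliberately left undefined.
MJCF = """
<mujoco>
  <worldbody>
    <body name="torso" pos="0 0 0.75">
      <site name="gyrosite" size="0.1 0.1 0.2" type="box"/>
      <body name="coxa1" pos="1.2 0.0 0.0">
        <joint axis="0 0 1" name="coxa_1" range="-45 45" type="hinge"/>
      </body>
    </body>
  </worldbody>
  <actuator>
    <motor joint="coxa_1" gear="100"/>
    <motor joint="femur_1" gear="100"/>
  </actuator>
  <sensor>
    <gyro name="gyrosensor" site="gyrosite"/>
  </sensor>
</mujoco>
"""


def find_dangling_references(mjcf_text):
    """Return a list of messages for motors/sensors that point at unknown names."""
    root = ET.fromstring(mjcf_text)  # raises ParseError on malformed XML
    joints = {joint.get("name") for joint in root.iter("joint")}
    sites = {site.get("name") for site in root.iter("site")}

    problems = []
    for motor in root.findall("./actuator/motor"):
        joint = motor.get("joint")
        if joint not in joints:
            problems.append(f"motor references unknown joint {joint!r}")
    for sensor in root.findall("./sensor/*"):
        site = sensor.get("site")
        # Only site-based sensors (such as gyro and accelerometer) are checked.
        if site is not None and site not in sites:
            problems.append(f"{sensor.tag} references unknown site {site!r}")
    return problems


print(find_dangling_references(MJCF))
```

For the fragment above, the check reports the motor that points at the undefined `femur_1` joint and accepts the gyro, whose site exists.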
The resulting model looks like this:
To use the MuJoCo model for reinforcement learning, the model has to be wrapped in a Python (Gym) environment.
The environment, and with it the model, can then be used like this:
import time

import gym
import numpy as np

import HexapodEnvironment  # importing this module is expected to register 'Hexapod-v0'

env = gym.make('Hexapod-v0')
obs = env.reset()
obs_dim = env.observation_space.shape[0]
act_dim = env.action_space.shape[0]

# Constant action with 18 control values (the pattern 0, 0, 1 repeated six times)
action = np.array([0.0, 0.0, 1.0] * 6, dtype=np.float32)

for i in range(500):
    env.render()
    obs, reward, done, info = env.step(action)
    time.sleep(.2)
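The 18-element action used above repeats the pattern 0, 0, 1 six times. Assuming the environment passes the action entries to the motors in the order they are declared in the actuator block (coxa, femur, tibia for each leg), this applies a constant control of 1.0 to every tibia motor and 0.0 to the others. The sketch below builds such a vector from per-segment values; `make_action` is a hypothetical helper and uses plain Python lists, which can be converted with `np.asarray(..., dtype=np.float32)` before calling `env.step`.

```python
# Assumed actuator order within each leg, following the motor definitions above.
SEGMENTS = ("coxa", "femur", "tibia")


def make_action(targets, n_legs=6):
    """Return a flat list of n_legs * 3 control values.

    `targets` maps each segment name to the control value applied to that
    segment's motor on every leg.
    """
    return [float(targets[segment]) for _ in range(n_legs) for segment in SEGMENTS]


action = make_action({"coxa": 0.0, "femur": 0.0, "tibia": 1.0})
print(len(action))   # number of control values
print(action[:3])    # values for the first leg
```

Naming the segments makes it easier to try other constant actions, for example driving only the femur motors, without counting positions in an 18-element literal.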