Running a sample
Unity ships the ML-Agents package with a number of prepared samples that demonstrate various aspects of learning and training scenarios. Let's open Unity, load a sample project, and get a feel for how ML-Agents runs by following this exercise:
- Open the Unity editor and go to the starting Project dialog.
- Click the Open button at the top of the dialog, then navigate to and select the ML-Agents/ml-agents/unity-environment folder, as shown in the following screenshot:

Loading the unity-environment project into the editor
- This will load the unity-environment project into the Unity editor. Depending on the Unity version you are using, you may get a warning that the version needs to be upgraded. As long as you are using a recent version of Unity, you can just click Continue. If you do experience problems, try upgrading or downgrading your version of Unity.
- Locate the Scene file in the Assets/ML-Agents/Examples/3DBall folder of the Project window, as shown in the following screenshot:

Locating the example scene file in the 3DBall folder
- Double-click the 3DBall scene file to open the scene in the editor.
- Press the Play button at the top center of the editor to run the scene. You will see that the scene starts running and that balls are being dropped, but the balls just fall off the platforms. This is because the scene starts up in Player mode, which means you can control the platforms with keyboard input. Try to balance the balls on the platform using the arrow keys on the keyboard.
- When you are done running the scene, click the Play button again to stop the scene.
Setting the agent Brain
As you witnessed, the scene is currently set for Player control, but we want to see how ML-Agents itself works. To do that, we need to change the Brain type the agent is using. Follow along to switch the Brain type in the 3D Ball agent:
- Locate the Ball3DAcademy object in the Hierarchy window and expand it to reveal the Ball3DBrain object.
- Select the Ball3DBrain object and then look to the Inspector window, as shown in the following screenshot:

Switching the Brain on the Ball3DBrain object
- Switch the Brain component, as shown in the preceding excerpt, to the Heuristic setting. The Heuristic brain setting is for agents that are coded internally within Unity scripts in a heuristic manner. Heuristic programming is nothing more than choosing a simpler, quicker solution when a classic approach (in our case, an ML algorithm) may take longer. Writing a Heuristic brain can often help you better define a problem, and it is a technique we will use later in this chapter. The majority of current game AIs fall within the category of heuristic algorithms.
- Press Play to run the scene. Now, you will see the platforms balancing each of the balls – very impressive for a heuristic algorithm. Next, we want to open the script with the heuristic brain and take a look at some of the code.
Note
You may need to adjust the Rotation Speed property, up or down, on the Ball 3D Decision (Script). Try a value of .5 for the rotation speed if the Heuristic brain seems unable to effectively balance the balls. The Rotation Speed property is hidden in the preceding screen excerpt.
- Click the Gear icon beside the Ball 3D Decision (Script), and from the context menu, select Edit Script, as shown in the following screenshot:

Editing the Ball 3D Decision script
- Take a look at the Decide method in the script as follows:

    public float[] Decide(
        List<float> vectorObs,
        List<Texture2D> visualObs,
        float reward,
        bool done,
        List<float> memory)
    {
        if (gameObject.GetComponent<Brain>()
            .brainParameters.vectorActionSpaceType == SpaceType.continuous)
        {
            List<float> act = new List<float>();

            // state[5] is the velocity of the ball in the x orientation.
            // We use this number to control the Platform's z axis rotation speed,
            // so that the Platform is tilted in the x orientation correspondingly.
            act.Add(vectorObs[5] * rotationSpeed);

            // state[7] is the velocity of the ball in the z orientation.
            // We use this number to control the Platform's x axis rotation speed,
            // so that the Platform is tilted in the z orientation correspondingly.
            act.Add(-vectorObs[7] * rotationSpeed);

            return act.ToArray();
        }

        // If the vector action space type is discrete, then we don't do anything.
        return new float[1] { 1f };
    }
- We will cover more details about what the inputs and outputs of this method mean later. For now, though, look at how simple the code is. This is the heuristic brain that is balancing the balls on the platform, which is fairly impressive when you see the code. The question that may hit you is: why are we bothering with ML programming, then? The simple answer is that the 3D ball problem is deceptively simple and can be easily modeled with eight states. Take a look at the code again and you can see that the highest state index used is 7, so the eight states run from 0 to 7, and the heuristic reads just two of them: the ball's velocity in the x and z orientations. This works well for this problem, but when we get to more complex examples, we may have millions upon billions of states, which is hardly anything we could easily solve using heuristic methods.
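To make the arithmetic in the Decide method concrete, here is a minimal stand-alone sketch of the same rule with the Unity types stripped out. The class name BallHeuristicSketch and the sample observation values are made up for illustration and are not part of the ML-Agents package; the index layout (5 for x velocity, 7 for z velocity) is taken from the comments in the script, and the rotation speed of 0.5 is the value suggested in the note above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone class; not part of ML-Agents.
public static class BallHeuristicSketch
{
    // Mirrors the continuous-action branch of Decide without any Unity types.
    // Assumes index 5 holds the ball's x velocity and index 7 its z velocity,
    // as stated in the comments of the original script.
    public static float[] Decide(IReadOnlyList<float> vectorObs, float rotationSpeed)
    {
        if (vectorObs.Count < 8)
        {
            throw new ArgumentException(
                "Expected at least 8 observation values.", nameof(vectorObs));
        }

        // z axis rotation speed follows the ball's x velocity.
        float zAxisRotation = vectorObs[5] * rotationSpeed;

        // x axis rotation speed follows the negated z velocity.
        float xAxisRotation = -vectorObs[7] * rotationSpeed;

        return new[] { zAxisRotation, xAxisRotation };
    }

    public static void Main()
    {
        // Made-up observation: only indices 5 and 7 influence the result.
        var obs = new List<float> { 0f, 0f, 0f, 0f, 0f, 2f, 0f, -4f };
        float[] action = Decide(obs, 0.5f);
        Console.WriteLine($"{action[0]}, {action[1]}"); // prints "1, 2"
    }
}
```

Each output is simply one velocity scaled by the rotation speed, which is why tuning the Rotation Speed property changes how aggressively the platform reacts.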
Heuristic brains should not be confused with Internal brains, which we will get to in Chapter 6, Terrarium Revisited – Building a Multi-Agent Ecosystem. While you could replace the heuristic code in the 3D ball example with an ML algorithm, that is not the best practice for running advanced ML such as Deep Learning algorithms, which we will discover in Chapter 3, Deep Reinforcement Learning with Python.
In the next section, we are going to modify the Basic example in order to get a better feel for how ML-Agents components work together.