You're reading from Learn Unity ML-Agents – Fundamentals of Unity Machine Learning: Incorporate new powerful ML algorithms such as Deep Reinforcement Learning for games

Product type Paperback

Published in Jun 2018

Publisher Packt

ISBN-13 9781789138139

Length 204 pages

Edition 1st Edition

Author (1):

Micheal Lanham

Table of Contents (13) Chapters

Title Page

Dedication

Packt Upsell

Contributors

Preface

1. Introducing Machine Learning and ML-Agents

2. The Bandit and Reinforcement Learning

3. Deep Reinforcement Learning with Python

4. Going Deeper with Deep Learning

5. Playing the Game

6. Terrarium Revisited – A Multi-Agent Ecosystem

Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Running a sample

Unity ships the ML-Agents package with a number of prepared samples that demonstrate various aspects of learning and training scenarios. Let's open Unity, load a sample project, and get a feel for how ML-Agents runs by following this exercise:

Open the Unity editor and go to the starting Project dialog.

Click the Open button at the top of the dialog and navigate to and select the ML-Agents/ml-agents/unity-environment folder, as shown in the following screenshot:

Loading the unity-environment project into the editor

This will load the unity-environment project into the Unity editor. Depending on the Unity version you are using, you may get a warning that the version needs to be upgraded. As long as you are using a recent version of Unity, you can just click Continue. If you do experience problems, try upgrading or downgrading your version of Unity.
Locate the Scene file in the Assets/ML-Agents/Examples/3DBall folder of the Project window, as shown in the following screenshot:

Locating the example scene file in the 3DBall folder

Double-click the 3DBall scene file to open the scene in the editor.
Press the Play button at the top center of the editor to run the scene. You will see that the scene starts running and that balls are being dropped, but the balls just fall off the platforms. This is because the scene starts up in Player mode, which means you can control the platforms with keyboard input. Try to balance the balls on the platform using the arrow keys on the keyboard.
When you are done running the scene, click the Play button again to stop the scene.

Setting the agent Brain

As you witnessed, the scene is currently set for Player control, but what we really want is to see how ML-Agents works. To do that, we need to change the Brain type the agent is using. Follow along to switch the Brain type in the 3D Ball agent:

Locate the Ball3DAcademy object in the Hierarchy window and expand it to reveal the Ball3DBrain object.
Select the Ball3DBrain object and then look to the Inspector window, as shown in the following screenshot:

Switching the Brain on the Ball3DBrain object

Switch the Brain component, as shown in the preceding excerpt, to the Heuristic setting. The Heuristic brain setting is for agents whose behavior is coded by hand in Unity scripts. Heuristic programming simply means choosing a simpler, quicker solution when a classic approach (in our case, an ML algorithm) may take longer. Writing a Heuristic brain can often help you better define a problem, and it is a technique we will use later in this chapter. The majority of current game AIs fall within the category of using heuristic algorithms.

Press Play to run the scene. Now, you will see the platforms balancing each of the balls – very impressive for a heuristic algorithm. Next, we want to open the script with the heuristic brain and take a look at some of the code.

Note

You may need to adjust the Rotation Speed property, up or down, on the Ball 3D Decision (Script). Try a Rotation Speed of 0.5 if the Heuristic brain seems unable to balance the balls effectively. The Rotation Speed property is hidden in the preceding screen excerpt.

Click the Gear icon beside the Ball 3D Decision (Script), and from the context menu, select Edit Script, as shown in the following screenshot:

Editing the Ball 3D Decision script

Take a look at the Decide method in the script as follows:

      public float[] Decide(
          List<float> vectorObs,
          List<Texture2D> visualObs,
          float reward,
          bool done,
          List<float> memory)
      {
          if (gameObject.GetComponent<Brain>()
                  .brainParameters.vectorActionSpaceType
              == SpaceType.continuous)
          {
              List<float> act = new List<float>();

              // vectorObs[5] is the velocity of the ball in the x orientation.
              // We use this number to control the Platform's z axis rotation
              // speed, so that the Platform is tilted in the x orientation
              // correspondingly.
              act.Add(vectorObs[5] * rotationSpeed);

              // vectorObs[7] is the velocity of the ball in the z orientation.
              // We use this number to control the Platform's x axis rotation
              // speed, so that the Platform is tilted in the z orientation
              // correspondingly.
              act.Add(-vectorObs[7] * rotationSpeed);

              return act.ToArray();
          }

          // If the vector action space type is discrete, then we don't do
          // anything.
          return new float[1] { 1f };
      }
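The listing above depends on Unity and ML-Agents types, so it only runs inside the editor. To experiment with the arithmetic on its own, the following is a minimal sketch, not part of the ML-Agents sample, that reproduces just the continuous branch as a plain console program. The class name HeuristicSketch, the length check, and the observation values are all hypothetical and chosen purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone sketch; not part of the ML-Agents sample project.
public static class HeuristicSketch
{
    // Mirrors the continuous branch of Decide: two actions derived from
    // the ball's x velocity (index 5) and z velocity (index 7).
    public static float[] Decide(IReadOnlyList<float> vectorObs, float rotationSpeed)
    {
        // Assumption: the caller passes at least eight observation values.
        if (vectorObs.Count < 8)
            throw new ArgumentException(
                "Expected at least 8 observation values.", nameof(vectorObs));

        return new[]
        {
            vectorObs[5] * rotationSpeed,   // z axis rotation speed
            -vectorObs[7] * rotationSpeed   // x axis rotation speed
        };
    }

    public static void Main()
    {
        // Made-up observation: only indices 5 and 7 matter to the heuristic.
        var obs = new List<float> { 0f, 0f, 0f, 0f, 0f, 2f, 0f, -1f };
        float[] act = Decide(obs, 0.5f);

        // With these inputs, act[0] is 1 and act[1] is 0.5.
        Console.WriteLine(act[0]);
        Console.WriteLine(act[1]);
    }
}
```

Raising or lowering the rotationSpeed argument scales both actions proportionally, which is why adjusting the Rotation Speed property changes how aggressively the platform reacts.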

We will cover what the inputs and outputs of this method mean in more detail later. For now, though, look at how simple the code is. This is the heuristic brain that is balancing the balls on the platforms, which is fairly impressive when you see the code. The question that may hit you is: why are we bothering with ML programming, then? The simple answer is that the 3D ball problem is deceptively simple and can be easily modeled with eight states. Take a look at the code again and you can see that the state vector holds only eight values (indices 0 to 7), and the heuristic reads just two of them: the ball's velocity in the x and z orientations (indices 5 and 7). This works well for this problem, but when we get to more complex examples, we may have millions upon billions of states, hardly anything we could easily solve using heuristic methods.
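To get a rough sense of how quickly the number of states can grow, suppose, purely for illustration, that every observed value is bucketed into a fixed number of levels. The count of distinct combinations is then the number of levels raised to the number of values. The sketch below computes that figure; the StateCount class and the bucket counts are assumptions made for this example and do not come from ML-Agents.

```csharp
using System;
using System.Numerics;

// Hypothetical illustration; the level counts are assumptions, not ML-Agents values.
public static class StateCount
{
    // Number of distinct states when each of `values` observations
    // is discretized into `levels` levels: levels raised to the power values.
    public static BigInteger Count(int levels, int values)
    {
        return BigInteger.Pow(levels, values);
    }

    public static void Main()
    {
        Console.WriteLine(Count(10, 8));   // 100000000
        Console.WriteLine(Count(10, 20));  // 100000000000000000000
    }
}
```

Even a coarse ten-level bucketing of eight values yields a hundred million combinations, and adding more observed values multiplies that figure further, which is why hand-written rules stop scaling.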

Heuristic brains should not be confused with Internal brains, which we will get to in Chapter 6, Terrarium Revisited – A Multi-Agent Ecosystem. While you could replace the heuristic code in the 3D ball example with an ML algorithm, that is not the best practice for running advanced ML such as the Deep Learning algorithms we will discover in Chapter 3, Deep Reinforcement Learning with Python.

In the next section, we are going to modify the Basic example in order to get a better feel for how ML-Agents components work together.