Implementing Rock-Paper-Scissors AI with UCB1
Rock-Paper-Scissors (RPS) is a classic game for testing AI techniques, which is why we will use it as the scenario for this recipe and the ones that follow. We will implement what are called bandit algorithms, which are based on the idea of exploring n-armed bandits. The problem is usually modeled on a slot machine, but we will study it as an RPS player. The main idea is to find the option that yields the best payoff.
In this recipe, we will learn about the UCB1 algorithm and how it works.
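To give a rough idea of what UCB1 does before building the recipe's class, here is a minimal standalone sketch of the standard UCB1 selection rule: try every option once, then pick the option with the highest average payoff plus an exploration bonus of sqrt(2 ln N / n_i), where N is the total number of plays and n_i is the number of times option i was played. The class name Ucb1Sketch and the method SelectArm are hypothetical names used only for illustration; they are not part of the recipe's Bandit class, and the recipe's own implementation may differ.

```csharp
using System;

// Hypothetical helper for illustration only; not the recipe's Bandit class.
public static class Ucb1Sketch
{
    // Returns the index of the option (arm) to play next.
    // count[i]: how many times option i has been played.
    // score[i]: total payoff collected so far by option i.
    public static int SelectArm(int[] count, float[] score)
    {
        int total = 0;
        for (int i = 0; i < count.Length; i++)
        {
            // Play every option once first; this also avoids dividing by zero.
            if (count[i] == 0)
                return i;
            total += count[i];
        }

        int best = 0;
        double bestValue = double.NegativeInfinity;
        for (int i = 0; i < count.Length; i++)
        {
            // Average payoff observed for this option.
            double mean = score[i] / count[i];
            // Exploration bonus: larger for options that were tried less often.
            double bonus = Math.Sqrt(2.0 * Math.Log(total) / count[i]);
            double value = mean + bonus;
            if (value > bestValue)
            {
                bestValue = value;
                best = i;
            }
        }
        return best;
    }
}
```

The bonus term shrinks as an option is played more often, so a rarely tried option can still be selected even when its average payoff so far is low.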
Getting ready...
First, we need to implement a data structure for defining our actions:
public enum RPSAction { Rock, Paper, Scissors }
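As a small usage illustration of this enum, the sketch below computes a payoff for one round from the point of view of one player. The class RpsPayoffSketch, the method Payoff, and the +1/0/-1 payoff values are assumptions made for this example; the recipe may score rounds differently.

```csharp
public enum RPSAction { Rock, Paper, Scissors }

// Hypothetical helper for illustration only.
public static class RpsPayoffSketch
{
    // Returns +1 if 'mine' wins, 0 on a tie, and -1 if 'mine' loses.
    public static int Payoff(RPSAction mine, RPSAction theirs)
    {
        if (mine == theirs)
            return 0;
        // With the order Rock(0), Paper(1), Scissors(2), each action
        // beats the one just before it, wrapping around cyclically.
        bool win = ((int)mine - (int)theirs + 3) % 3 == 1;
        return win ? 1 : -1;
    }
}
```

For example, Payoff(RPSAction.Paper, RPSAction.Rock) evaluates to 1, since Paper beats Rock.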
How to do it...
We will implement a Bandit class for building the logic behind the algorithm:
- Create a new class named Bandit:
using UnityEngine;

public class Bandit : MonoBehaviour
{
    // next steps
}
- Define the required member variables:
bool init;
int totalActions;
int[] count;
float[] score;
int numActions;
RPSAction lastAction;
int lastStrategy...