Difference between revisions of "CISC181 S2017 Lab4"

From class_wiki
Jump to: navigation, search
(Simple tic-tac-toe AI)
(Simple tic-tac-toe AI)
Line 23: Line 23:
 
The idea behind our Monte Carlo approach to tic-tac-toe move selection is simple.  Some moves may lead to a certain win or defeat (either immediately or inevitably), while others simply make victory more or less likely.  Given a choice of ''M'' moves (aka open spaces) on the board, we would like to assess the likelihood of victory or defeat resulting from a '''first move m''' in each of the ''M'' spaces. Whichever candidate move has the highest likelihood of victory is the one we make.  Then the opponent makes their move and we repeat the process until the game ends.
 
The idea behind our Monte Carlo approach to tic-tac-toe move selection is simple.  Some moves may lead to a certain win or defeat (either immediately or inevitably), while others simply make victory more or less likely.  Given a choice of ''M'' moves (aka open spaces) on the board, we would like to assess the likelihood of victory or defeat resulting from a '''first move m''' in each of the ''M'' spaces. Whichever candidate move has the highest likelihood of victory is the one we make.  Then the opponent makes their move and we repeat the process until the game ends.
  
It is not always possible or easy to calculate such a win likelihood exactly, so we try to estimate the likelihood with a probabilistic method.  We assume that after our first move ''m'' (out of ''M'' available), all subsequent moves by us and by our opponent will be made randomly (but legally).  Given this, we can "simulate" a random conclusion to the game after our first move, and then record whether a win, loss, or tie results.  By '''repeatedly''' running this simulation ''T'' times with the same first move, we can tabulate the statistics of wins, losses, and ties for all kinds of possible endgames after our hypothetical first move.  Suppose we assign a score of +1 to each win, -10 to each loss, and 0 to each tie.  If ''T''=10000 and we win 6000 times, lose 3300 times, and tie 700 times, then we can calculate an aggregate "win value" ''W'' for move ''m'' of  
+
It is not always possible or easy to calculate such a win likelihood exactly, so we try to estimate the likelihood with a probabilistic method.  We assume that after our first move ''m'' (out of ''M'' available), all subsequent moves by us and by our opponent will be made randomly (but legally).  Given this, we can "simulate" a random conclusion to the game after our first move, and then record whether a win, loss, or tie results.  By '''repeatedly''' running this simulation ''T'' times with the same first move, we can tabulate the statistics of wins, losses, and ties for all kinds of possible endgames after our hypothetical first move.  Suppose we assign a score of +1 to each win, -10 to each loss, and 0 to each tie.  If ''T''=10000 and we win 6000 times, lose 3300 times, and tie 700 times, then we can calculate an aggregate "win value" ''W'' for move ''m'' of:
  
 
  ''W''(''m'') = 6000 * 1 + 3300 * -10 + 700 * 0 = 6000 - 33000 = -27000
 
  ''W''(''m'') = 6000 * 1 + 3300 * -10 + 700 * 0 = 6000 - 33000 = -27000

Revision as of 14:01, 6 March 2017

Preliminaries

  • Make a new project with n = 4 (following these instructions)
  • Name your main class "Lab4" (when creating a new module in the instructions above, in the Java class name field)

Instructions

Add TicTacToe.java and TicTacToeBoard.java to your project folder. Make your main() function in Lab4.java look like this:

TicTacToe TTT = new TicTacToe();
TTT.run_tournament();
 

This will start a "tournament", in which the user can choose an opponent for each game, decide whether to go first or second, or quit. Code is already written to handle the logic of each game, including checking for a winner and preventing invalid moves. Currently there are two pre-written options for opponents: human, in which you will alternate choosing moves manually; and random, in which the computer chooses a random open space on its turn with no intelligence whatsoever.

Your job in this lab is to add a third opponent semi-intelligent option, which we will call Monte Carlo (explained below). Specifically, you will fill in the body of the chooseMonteCarloMove() method in TicTacToeBoard class. You should also add your name in a comment before the class body.

Simple tic-tac-toe AI

The idea behind our Monte Carlo approach to tic-tac-toe move selection is simple. Some moves may lead to a certain win or defeat (either immediately or inevitably), while others simply make victory more or less likely. Given a choice of M moves (aka open spaces) on the board, we would like to assess the likelihood of victory or defeat resulting from a first move m in each of the M spaces. Whichever candidate move has the highest likelihood of victory is the one we make. Then the opponent makes their move and we repeat the process until the game ends.

It is not always possible or easy to calculate such a win likelihood exactly, so we try to estimate the likelihood with a probabilistic method. We assume that after our first move m (out of M available), all subsequent moves by us and by our opponent will be made randomly (but legally). Given this, we can "simulate" a random conclusion to the game after our first move, and then record whether a win, loss, or tie results. By repeatedly running this simulation T times with the same first move, we can tabulate the statistics of wins, losses, and ties for all kinds of possible endgames after our hypothetical first move. Suppose we assign a score of +1 to each win, -10 to each loss, and 0 to each tie. If T=10000 and we win 6000 times, lose 3300 times, and tie 700 times, then we can calculate an aggregate "win value" W for move m of:

W(m) = 6000 * 1 + 3300 * -10 + 700 * 0 = 6000 - 33000 = -27000

For more background, see Monte Carlo tree search (FYI, we are doing the "pure", non-tree version here).

Finishing up

Remember to use proper naming and formatting style throughout your code, and submit ONLY TicTacToeBoard.java on Sakai by Friday, March 10