a game is defined by:

  • initial state - the environment setup
  • operators - defines the legal moves
  • terminal test - test when game is over
  • utility or payoff function - which says who won, and by how much

Minimax

  • in two-player games with perfect information, minimax algorithm can determine the best move for a player (assuming the opponent plays perfectly) by enumerating the entire game tree

Alpha Beta Pruning

  • simulation
  • does the same calculation as minimax, but is more efficient by pruning irrelevant branches of search tree
function minimax(node, depth, isMaximizingPlayer, alpha, beta):
 
    if node is a leaf node :
        return value of the node
    
    if isMaximizingPlayer :
        bestVal = -INFINITY 
        for each child node :
            value = minimax(node, depth+1, false, alpha, beta)
            bestVal = max( bestVal, value) 
            alpha = max( alpha, bestVal)
            if beta <= alpha:
                break
        return bestVal
 
    else :
        bestVal = +INFINITY 
        for each child node :
            value = minimax(node, depth+1, true, alpha, beta)
            bestVal = min( bestVal, value) 
            beta = min( beta, bestVal)
            if beta <= alpha:
                break
        return bestVal
 

Cut-Off and Evaluation Function

  • usually not feasible to consider the whole game tree (even with alpha-beta pruning), so:
    • cut off the search at some point
    • apply an evaluation function to estimate the utility of the state

Games That Include an Element of Chance

  • mastering games like backgammon
  • games of chance can be handled by an extension to the minimax algorithm that evaluates chance nodes by taking the average utility of all its children nodes, weighted by the probability of the child
  • the game tree includes
    • chance nodes
    • MAX nodes
    • MIN nodes
  • expectimax value
  • expectimin value
  • expectiminimax value

Discussion

  • utility of a node expansion
    • a good search algorithm should select node expansions of high utility—that is, ones that are likely to lead to the discovery of a significantly better move
  • metareasoning
    • reasoning about reasoning
    • alpha-beta pruning is a simple kind of metareasoning

Extra: AND-OR Search Trees

  • augment search trees in the following way
    • branching caused by agent’s choice of action are called OR nodes. E.g., in vacuum world, agent chooses Left or Right or Suck
    • branching caused by environment’s choice of action are called AND nodes. E.g., in erratic vacuum world, Suck action in state 1 leads to a state in { 5, 7 }, so agent would need to find a plan for state 5 and state 7.
  • 2 kinds of nodes alternate giving AND-OR tree

AND-OR Search Algorithm

// returns: conditional plan or failure 
AND_OR_graph_search(Problem problem) {
    OR_search(problem.initial-state, [])
}
 
// returns: conditional plan or failure
OR_search(State state, List path) {
  if (state is goal) return empty plan
  if (state is in path) return failure
  for (action in state.possibleActions) {
    List resulting-states = results(state, action)
    plan = AND_search(resulting-states, problem, [state|path])
    if (plan != failure) return [action|plan]
  }
  return failure
}
 
// returns: conditional plan or failure
AND_search(State states, List path) {
  for (state-i in states) {
    plan-i = OR_search(state-i, path)
    if (plan-i == failure) return failure
  }
  return [if state-1 then plan-1
          elif state-2 then plan-2
          ...
          else plan-n];
}
 
  • the above algorithm is Depth First Search
  • can be modified to support:
    • un-informed search:
      • breadth-first-search
      • uniform-cost-search
      • iterative-deepening breadth-first-search
    • informed search:
      • greedy-search
      • A*-search
      • iterative-deepening A*-search