Monday, August 31, 2020

A classic AI problem: games

 

In this post, we will study a classic AI problem: games. For the sake of clarity, we will focus on the simplest scenario: two-player, perfect-information games such as tic-tac-toe and chess.

Example: playing tic-tac-toe

Maxine and Minnie are true game enthusiasts. They just love games. Especially two-player, perfect-information games such as tic-tac-toe or chess. One day they were playing tic-tac-toe. Maxine, or Max as her friends call her, was playing with X. Minnie, or Min as her friends call her, had the Os. Min had just played her turn and the board looked as follows:


two kids playing tic tac toe

Max was looking at the board and contemplating her next move, as it was her turn, when she suddenly buried her face in her hands in despair, looking quite like Garry Kasparov playing Deep Blue in 1997.

Yes, Min was close to getting three Os on the top row, but Max could easily put a stop to that plan. So why was Max so pessimistic?

Game trees

To solve games using AI, we will introduce the concept of a game tree. The different states of the game are represented by nodes in the game tree, much like in the planning problems discussed in our earlier post on search and problem solving. The idea is just slightly different. In the game tree, the nodes are arranged in levels that correspond to each player's turns in the game, so that the “root” node of the tree (usually depicted at the top of the diagram) is the beginning position in the game. In tic-tac-toe, this would be the empty grid with no Xs or Os played yet. Under the root, on the second level, are the possible states that can result from the first player’s moves, be it X or O. We call these nodes the “children” of the root node.

Each node on the second level would, in turn, have as its children the states that can be reached from it by the opposing player's moves. This is continued, level by level, until we reach states where the game is over. In tic-tac-toe, this means that either one of the players gets a line of three and wins, or the board is full and the game ends in a tie.

Minimizing and maximizing value

In order to be able to create game AI that attempts to win the game, we attach a numerical value to each possible end result. To the board positions where X has a line of three so that Max wins, we attach the value +1, and likewise, to the positions where Min wins with three Os in a row we attach the value -1. For the positions where the board is full and neither player wins, we use the neutral value 0 (it doesn’t really matter what the values are as long as they are in this order so that Max tries to maximize the value, and Min tries to minimize it).

A sample game tree

Consider, for example, the following game tree, which begins not at the root but in the middle of the game (otherwise the tree would be far too big to display). Note that this is a different game from the one shown in the illustration at the beginning of this post. We have numbered the nodes from 1 to 14.

The tree is composed of alternating layers where it is either Min's turn to place an O or Max's turn to place an X at any of the vacant slots on the board. The player whose turn it is to play next is shown at the left.

game tree 1

The game continues from the board position shown in the root node, numbered (1) at the top, with Min’s turn to place an O in any of the three vacant cells. Nodes (2)–(4) show the board positions resulting from each of the three choices. At the next step, each of these nodes has two vacant cells where Max can place her X, and so the tree branches again.

Starting from the above board position, the game always ends in a row of three: in nodes (7) and (9), the winner is Max, who plays with X, and in nodes (11)–(14) the winner is Min, who plays with O.

Note that since the players’ turns alternate, the levels can be labeled as Min levels and Max levels, which indicates whose turn it is.

Being strategic

Consider nodes (5)–(10) on the second level from the bottom. In nodes (7) and (9), the game is over, and Max wins with three Xs in a row. The value of these positions is +1. In the remaining nodes, (5), (6), (8), and (10), the game is also practically over, since Min only needs to place her O in the one remaining cell to win. In other words, we know how the game will end at each node on the second level from the bottom. We can therefore assign the value –1 to nodes (5), (6), (8), and (10) as well.

game tree 2

Here comes the interesting part. Let’s consider the values of the nodes one level higher towards the root: nodes (2)–(4). Since we observed that both of the children of (2), i.e., nodes (5) and (6), lead to Min’s victory, we can without hesitation attach the value -1 to node (2) as well. However, for node (3), the left child (7) leads to Max’s victory, +1, but the right child (8) leads to Min winning, -1. What is the value of node (3)? Think about this for a while, keeping in mind who makes the choice at node (3).

Since it is Max’s turn to play, she will of course choose the left child, node (7). Thus, every time we reach the board position in node (3), Max can ensure victory, and we can attach the value +1 to node (3).

The same holds for node (4): again, since Max can choose where to put her X, she can always ensure victory, and we attach the value +1 to node (4).

game tree 3

Determining who wins

The most important lesson in this section is that this kind of reasoning can be applied repeatedly to determine the result of the game in advance from any board position.

So far, we have decided that the value of node (2) is –1, which means that if we end up in such a board position, Min can ensure winning, and that the reverse holds for nodes (3) and (4): their value is +1, which means that Max can be sure to win if she only plays her own turn wisely.

Finally, we can deduce that since Min is an experienced player, she can reach the same conclusion, and thus she only has one real option: play the O in the middle of the board.

In the diagram below, we have included the value of each node as well as the optimal game play starting at Min's turn in the root node.

game tree 4
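To see the whole computation in one place, here is a minimal Python sketch (a hand-check, not a general algorithm) that hard-codes the sample tree above, with the node numbers and end-of-game values as given in the text, and backs the values up to the root:

    # Children of the non-terminal nodes in the sample tree.
    children = {1: [2, 3, 4], 2: [5, 6], 3: [7, 8], 4: [9, 10]}

    # Values of the nodes on the second level from the bottom: in (7)
    # and (9) Max has won; in (5), (6), (8), and (10) Min wins next move.
    value = {5: -1, 6: -1, 7: +1, 8: -1, 9: +1, 10: -1}

    # Nodes (2)-(4) are on a Max level: Max picks the best child.
    for node in (2, 3, 4):
        value[node] = max(value[c] for c in children[node])

    # The root (1) is on a Min level: Min picks the lowest value.
    value[1] = min(value[c] for c in children[1])

    print(value[2], value[3], value[4], value[1])   # -1 1 1 -1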

The value of the root node = who wins

The value of the root node, which is said to be the value of the game, tells us who wins (and by how much, if the outcome is not just a plain win or loss): Max wins if the value of the game is +1, Min wins if the value is –1, and if the value is 0, the game will end in a draw. In other games, the value may take other values as well (such as the monetary value of the chips in front of you in poker).

All of this is based on the assumption that both players choose what is best for them, and that what is best for one is worst for the other (a so-called “zero-sum game”).

The Minimax algorithm

We can exploit the above concept of the value of the game to obtain an algorithm called the Minimax algorithm. It guarantees, theoretically speaking, optimal game play in any deterministic, two-player, perfect-information zero-sum game. Given a state of the game, the algorithm simply computes the values of the children of the given state and chooses the one that has the maximum value if it is Max’s turn, and the one that has the minimum value if it is Min’s turn.

The algorithm can be implemented using a few lines of code. However, we will be satisfied with having grasped the main idea. If you are interested in taking a look at the actual algorithm (alert: programming required), feel free to check out, for example, Wikipedia: Minimax.
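To give a flavor of what such an implementation might look like, here is a minimal recursive sketch in Python. The helpers game_over, utility, and moves are hypothetical placeholders that would have to be written for the concrete game at hand (for tic-tac-toe: test whether the game has ended, score an end position as +1, –1, or 0, and list the positions reachable in one move, respectively):

    def minimax(state, maximizing):
        # Value of `state`, assuming optimal play by both sides.
        # game_over, utility, and moves are hypothetical game-specific
        # helpers, not part of any standard library.
        if game_over(state):
            return utility(state)
        values = [minimax(child, not maximizing) for child in moves(state)]
        # Max picks the highest attainable value, Min the lowest.
        return max(values) if maximizing else min(values)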

As stated above, the Minimax algorithm can be used to implement optimal game play in any deterministic, two-player, perfect-information zero-sum game. Such games include tic-tac-toe, connect four, chess, Go, and so on. Rock-paper-scissors is not in this class of games, since it involves information hidden from the other player; nor are Monopoly or backgammon, which are not deterministic. So, as far as this topic is concerned, is that all, folks? Can we go home now? The answer is: in theory yes, but in practice no.

A few more tricks are needed to manage massive game trees. Many of them were crucial elements in IBM’s Deep Blue computer defeating the chess world champion, Garry Kasparov, in 1997.

If we can afford to explore only a small part of the game tree, we need a way to stop the Minimax algorithm before reaching an end node, i.e., a node where the game is over and the winner is known. This is achieved by using a so-called heuristic evaluation function that takes as input a board position (including the information about which player's turn is next) and returns a score that estimates the likely outcome of the game continuing from the given position.

The Minimax algorithm presented above requires only minimal changes to obtain a depth-limited version, in which the heuristic evaluation is returned at all nodes at a given depth limit: the depth simply refers to the number of steps that the game tree is expanded before the heuristic evaluation function is applied.
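Continuing the sketch above, the change might look as follows; evaluate is a hypothetical heuristic evaluation function for the game in question:

    def minimax_limited(state, depth, maximizing):
        if game_over(state):
            return utility(state)
        if depth == 0:
            # Depth limit reached: fall back on the heuristic estimate
            # instead of searching all the way to the end of the game.
            return evaluate(state)
        values = [minimax_limited(child, depth - 1, not maximizing)
                  for child in moves(state)]
        return max(values) if maximizing else min(values)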


Sunday, August 30, 2020

Interlude on the history of AI: starting from search


AI is arguably as old as computer science. Long before we had computers, people thought of the possibility of automatic reasoning and intelligence. As we already mentioned, one of the great thinkers who considered this question was Alan Turing. In addition to the Turing test, his contributions to AI, and more generally to computer science, include the insight that anything that can be computed (= calculated using either numbers or other symbols) can be automated.

The term Artificial Intelligence was coined by John McCarthy (1927–2011), who is often referred to as the Father of AI. The term became established when it was chosen as the topic of a summer seminar, known as the Dartmouth conference, which McCarthy and others organized in 1956 at Dartmouth College in New Hampshire. In the proposal for the seminar, McCarthy built on Turing's argument about automated computation. The proposal contains the following crucial statement:

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

In other words, any element of intelligence can be broken down into steps so small that each of them is so simple and “mechanical” that it can be written down as a computer program. This statement was, and still is today, a conjecture, which means that we can’t really prove it to be true. Nevertheless, the idea is absolutely fundamental to the way we think about AI. For example, it shows that McCarthy wanted to bypass any arguments in the spirit of Searle's Chinese Room: intelligence is intelligence even if the system that implements it is just a computer that mechanically follows a program.

Why search and games became central in AI research

As computers developed to the level where it was feasible to experiment with practical AI algorithms in the 1950s, the most distinctive AI problems (besides cracking Nazi codes) were games. Games provided a convenient restricted domain that could be formalized easily. Board games such as checkers, chess, and recently quite prominently Go (an extremely complex strategy board game originating from China at least 2500 years ago), have inspired countless researchers, and continue to do so.

Closely related to games, search and planning techniques were an area where AI led to great advances in the 1960s: algorithms with names such as the Minimax algorithm or Alpha-Beta Pruning, which were developed then, are still the basis for game-playing AI, although of course more advanced variants have been proposed over the years. In the next posts, we will study games and planning problems at a conceptual level.


Saturday, August 29, 2020

Search and problem solving


Many problems can be phrased as search problems. This requires that we start by formulating the alternative choices and their consequences.

Search in practice: getting from A to B

Imagine you’re in a foreign city, at some address (say a hotel) and want to use public transport to get to another address (a nice restaurant, perhaps). What do you do? If you are like many people, you pull out your smartphone, type in the destination and start following the instructions.

hand holding a phone using a navigation app

This question belongs to the class of search and planning problems. Similar problems need to be solved by self-driving cars and (perhaps less obviously) by AI for playing games. In the game of chess, for example, the difficulty is not so much in getting a piece from A to B as in keeping your pieces safe from the opponent.

single path leading to a goal

Often there are many different ways to solve the problem, some of which may be more preferable in terms of time, effort, cost or other criteria. Different search techniques may lead to different solutions, and developing advanced search algorithms is an established research area.

four different paths leading to the same goal

We will not focus on the actual search algorithms. Instead, we emphasize the first stage of the problem solving process: defining the choices and their consequences, which is often far from trivial and can require careful thinking. We also need to define what our goal is, or in other words, when we can consider the problem solved. After this has been done, we can look for a sequence of actions that leads from the initial state to the goal.

In this post, we will discuss two kinds of problems:

  • Search and planning in static environments with only one “agent”
  • Games with two players (“agents”) competing against each other

These categories don’t cover all possible real-world scenarios, but they are generic enough to demonstrate the main concepts and techniques.

Before we address complex search tasks like navigation or playing chess, let us start from a much simpler setting in order to build up our understanding of how we can solve problems with AI.

robot in a boat looking at a chicken, fox, and chicken feed bag

Toy problem: chicken crossing

We’ll start from a simple puzzle to illustrate the ideas. A robot on a rowboat needs to move three pieces of cargo across a river: a fox, a chicken, and a sack of chicken-feed. The fox will eat the chicken if it has the chance, and the chicken will eat the chicken-feed if it has the chance, and neither is a desirable outcome. The robot is capable of keeping the animals from doing harm when it is near them, but only the robot can operate the rowboat and only two of the pieces of cargo can fit on the rowboat together with the robot. How can the robot move all of its cargo to the opposite bank of the river?

We can model the puzzle by noting that there are five movable things: the robot, the rowboat, the fox, the chicken, and the chicken-feed. In principle, each of the five can be on either side of the river, but since only the robot can operate the rowboat, the two will always be on the same side. Thus there are four things with two possible positions each, which makes for sixteen combinations, which we will call states:

States of the chicken crossing puzzle

State   Robot      Fox        Chicken    Chicken-feed
NNNN    Near side  Near side  Near side  Near side
NNNF    Near side  Near side  Near side  Far side
NNFN    Near side  Near side  Far side   Near side
NNFF    Near side  Near side  Far side   Far side
NFNN    Near side  Far side   Near side  Near side
NFNF    Near side  Far side   Near side  Far side
NFFN    Near side  Far side   Far side   Near side
NFFF    Near side  Far side   Far side   Far side
FNNN    Far side   Near side  Near side  Near side
FNNF    Far side   Near side  Near side  Far side
FNFN    Far side   Near side  Far side   Near side
FNFF    Far side   Near side  Far side   Far side
FFNN    Far side   Far side   Near side  Near side
FFNF    Far side   Far side   Near side  Far side
FFFN    Far side   Far side   Far side   Near side
FFFF    Far side   Far side   Far side   Far side

We have given short names to the states, because otherwise it would be cumbersome to talk about them. Now we can say that the starting state is NNNN and the goal state is FFFF, instead of something like “in the starting state, the robot is on the near side, the fox is on the near side, the chicken is on the near side, and also the chicken-feed is on the near side, and in the goal state the robot is on the far side”, and so on.

Some of these states are forbidden by the puzzle conditions. For example, in state NFFN (meaning that the robot is on the near side with the chicken-feed but the fox and the chicken are on the far side), the fox will eat the chicken, which we cannot have. Thus we can rule out states NFFN, NFFF, FNNF, FNNN, NNFF, and FFNN (you can check each one if you doubt our reasoning). We are left with the following ten states:

State   Robot      Fox        Chicken    Chicken-feed
NNNN    Near side  Near side  Near side  Near side
NNNF    Near side  Near side  Near side  Far side
NNFN    Near side  Near side  Far side   Near side
NFNN    Near side  Far side   Near side  Near side
NFNF    Near side  Far side   Near side  Far side
FNFN    Far side   Near side  Far side   Near side
FNFF    Far side   Near side  Far side   Far side
FFNF    Far side   Far side   Near side  Far side
FFFN    Far side   Far side   Far side   Near side
FFFF    Far side   Far side   Far side   Far side
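These checks can also be done mechanically. Here is a minimal Python sketch, assuming each state is encoded as a four-character string giving the sides of the robot, the fox, the chicken, and the chicken-feed, each 'N' (near side) or 'F' (far side); it recovers exactly the six forbidden states listed above:

    def is_allowed(state):
        # The fox must not be left alone with the chicken, and the
        # chicken must not be left alone with the chicken-feed.
        robot, fox, chicken, feed = state
        if fox == chicken != robot:      # fox and chicken, robot elsewhere
            return False
        if chicken == feed != robot:     # chicken and feed, robot elsewhere
            return False
        return True

    all_states = [r + f + c + s for r in "NF" for f in "NF"
                  for c in "NF" for s in "NF"]
    print([s for s in all_states if not is_allowed(s)])
    # ['NNFF', 'NFFN', 'NFFF', 'FNNN', 'FNNF', 'FFNN']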

Next we will figure out which state transitions are possible, meaning simply this: when the robot rows the boat with some of the items as cargo, which state results in each case? It’s best to draw a diagram of the transitions, and since in any transition the first letter alternates between N and F, it is convenient to draw the states starting with N (the robot on the near side) in one row and the states starting with F in another row:

chicken crossing initial state

Now let's draw the transitions. We could draw arrows with a direction, pointing from one node to another, but in this puzzle the transitions are symmetric: if the robot can row from state NNNN to state FNFF, it can equally well row the other way, from FNFF to NNNN. Thus it is simpler to draw the transitions with lines that don't have a direction. Starting from NNNN, we can go to FNFN, FNFF, FFNF, and FFFN:

chicken crossing first transition
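For readers who want to see how the transitions could be generated mechanically rather than by hand, here is a sketch that builds on the is_allowed function above. The robot always crosses, taking zero, one, or two of the items along:

    from itertools import combinations

    def moves(state):
        # All allowed states reachable from `state` by one boat trip.
        robot, items = state[0], state[1:]
        here = [i for i, side in enumerate(items) if side == robot]
        other = "F" if robot == "N" else "N"
        result = []
        for k in (0, 1, 2):              # the boat fits at most two items
            for cargo in combinations(here, k):
                crossed = [other if i in cargo else side
                           for i, side in enumerate(items)]
                candidate = other + "".join(crossed)
                if is_allowed(candidate):
                    result.append(candidate)
        return result

    print(moves("NNNN"))   # ['FNFN', 'FFFN', 'FFNF', 'FNFF']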

Then we fill in the rest:

chicken crossing all transitions

We have now done quite a bit of work on the puzzle without seeming any closer to the solution, and there is little doubt that you could have solved the whole puzzle already by using your “natural intelligence”. But for more complex problems, where the number of possible states grows into the thousands or millions, this systematic, mechanical approach shines, since the hard part becomes something a simple computer can do. Now that we have formulated the alternative states and the transitions between them, the rest is a mechanical task: find a path from the initial state NNNN to the final state FFFF.
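Finding such a path is a textbook job for breadth-first search, which explores the diagram level by level and therefore returns a path with the fewest boat trips. Here is a minimal sketch building on the moves function above; note that several equally short solutions exist, so the path found may differ from the one pictured below:

    from collections import deque

    def find_path(start, goal):
        # Breadth-first search over the state graph defined by moves().
        frontier = deque([[start]])
        visited = {start}
        while frontier:
            path = frontier.popleft()
            if path[-1] == goal:
                return path
            for nxt in moves(path[-1]):
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append(path + [nxt])
        return None   # no path exists

    print(find_path("NNNN", "FFFF"))
    # e.g. ['NNNN', 'FNFN', 'NNFN', 'FFFF'] -- three boat trips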

One such path is colored in the following picture. The path proceeds from NNNN to FFFN (the robot takes the fox and the chicken to the other side), then to NFNN (the robot takes the chicken back to the starting side), and finally to FFFF (the robot can now move the chicken and the chicken-feed to the other side).

chicken crossing with best path

State space, transitions, and costs

To formalize a planning problem, we use concepts such as the state space, transitions, and costs.

The state space

means the set of possible situations. In the chicken-crossing puzzle, the state space consisted of the ten allowed states NNNN through FFFF (but not, for example, NFFF, which the puzzle rules don't allow). If the task is to navigate from place A to place B, the state space could be the set of locations defined by their (x,y) coordinates that can be reached from the starting point A. Or we could use a constrained set of locations, for example, different street addresses, so that the number of possible states is limited.

Transitions

are possible moves between one state and another, such as NNNN to FNFN. It is important to note that we count as transitions only direct moves that can be accomplished with a single action. A sequence of multiple transitions, for example from A to C, from C to D, and from D to B (the goal), is a path rather than a single transition.

Costs

refer to the fact that, oftentimes, the different transitions aren't all alike. They can differ in ways that make some transitions more preferable or cheaper (in a not necessarily monetary sense of the word) and others more costly. We can express this by associating with each transition a certain cost. If the goal is to minimize the total distance traveled, then a natural cost is the geographical distance between states. On the other hand, the goal could be to minimize the time instead of the distance, in which case the natural cost would obviously be the time. If all the transitions are equal, we can ignore the costs.
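Costs are not covered further in this post, but as a rough illustration of how they change the search: instead of breadth-first search, one can use a standard method such as Dijkstra's algorithm, which expands states in order of the cheapest total cost found so far. In this sketch, cost is a hypothetical function giving the cost of a single transition (with a constant cost of 1 it finds the path with the fewest transitions, just like breadth-first search):

    import heapq

    def cheapest_path(start, goal, cost):
        # Dijkstra's algorithm over the state graph defined by moves().
        frontier = [(0, start, [start])]     # (total cost, state, path)
        best = {start: 0}
        while frontier:
            total, state, path = heapq.heappop(frontier)
            if state == goal:
                return total, path
            for nxt in moves(state):
                new_total = total + cost(state, nxt)
                if new_total < best.get(nxt, float("inf")):
                    best[nxt] = new_total
                    heapq.heappush(frontier, (new_total, nxt, path + [nxt]))
        return None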
