Due to the time and storage demands of the task at hand, it is obviously not viable to store the graph in memory. So I use the hard drive.
For some reason, I decided that I would store the graph in a series of files whose size is about 2 gigabytes. Each node has a unique 64-bit identifier. The top half of the ID is the index of the file containing the node, and the bottom half is the position within the file. This ID can be used to reference nodes, which is necessary to implement edges and category chains.
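The blog doesn't show code and never names its implementation language, so here is a small Python sketch of the ID scheme as I read it: the top 32 bits select the graph file, the bottom 32 give the position within it. The function names are mine, not from the original program.

```python
FILE_BITS = 32                      # width of the bottom (position) half
OFFSET_MASK = (1 << FILE_BITS) - 1

def make_id(file_index: int, position: int) -> int:
    """Combine a file index and an in-file position into a 64-bit node ID."""
    return (file_index << FILE_BITS) | position

def split_id(node_id: int):
    """Recover (file_index, position) from a node ID."""
    return node_id >> FILE_BITS, node_id & OFFSET_MASK
```

With files of about 2 gigabytes, 32 bits of position are enough to address any byte within a file, and 32 bits of file index are far more than will ever be needed.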
Apart from the graph files, I have a descriptor in which I store some global info like the game in question, the current graph file and its length, and the number of normal and transformation nodes. For convenience, I have a separate file which just lists all the node IDs. The ID file is not useful for normal operation, but it helps when I want to traverse the graph, for example for testing purposes.
The final file required by my program stores the category heads. This is a file-based mapping from category to node ID. The category chains are implemented by having each node store the ID of the next node in the chain.
Now I shall describe how exactly I store the nodes. The first byte contains flags which mark the type of the node (normal or transformation), the result of the node, and the player to move. Then there is a block of parent IDs. I mentioned that I need to store the parents of each node in order to properly update the results of the nodes. However, unlike the number of children (which is just the number of legal moves in the state), the number of parents cannot be known in advance; the list of parents must be allowed to grow indefinitely. So I allocate blocks of IDs and mark the empty slots with an invalid ID. When a new parent is added and only a single slot is left, I allocate a new block at the end of the file, and write the ID (the file and position) of this new block into the last slot. This creates a linked list of parent ID blocks, which does the job I need.

So after the first byte, I have this first parent ID block. Then I have 8 bytes for a single node ID; I call this the referenced node. For normal nodes, I use this for the category chains: I write here the next node in the current chain. For transformation nodes, as they can have only a single child, I store it here. Note that transformation nodes need not be included in category chains: if a new state is related to the state in a transformation node, then it must also be related to the normal node to which the transformation node points, and this normal node resides in a chain, so it can be found for the new state. Then, if I want to minimize the number of nodes, I can check the parents of this normal node for a transformation node with exactly the needed transformation, so I can still avoid creating multiple identical transformation nodes.
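The parent-block scheme can be simulated in memory with a short Python sketch. One simplification on my part: here the last slot of every block is reserved for the link from the start, whereas the post converts the last slot only when an overflow actually happens. The block size and all names are mine.

```python
BLOCK_SIZE = 4          # slots per block (illustrative; the real size differs)
INVALID = -1            # marker for an empty slot

def new_block():
    return [INVALID] * BLOCK_SIZE

def add_parent(blocks, head, parent_id):
    """Add parent_id to the chain of blocks starting at index `head`.
    `blocks` is a list standing in for the graph file."""
    block = blocks[head]
    for i in range(BLOCK_SIZE - 1):      # last slot is reserved for the link
        if block[i] == INVALID:
            block[i] = parent_id
            return
    if block[-1] == INVALID:             # no continuation block yet
        blocks.append(new_block())
        block[-1] = len(blocks) - 1      # link slot stores the new block's "ID"
    add_parent(blocks, block[-1], parent_id)

blocks = [new_block()]
for p in range(5):
    add_parent(blocks, 0, p)
# parents 0..2 fill block 0; its last slot links to block 1 holding 3 and 4
```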
Okay, so after the referenced node, a transformation node has the transformation written into it, and that's all. A normal node has the number of moves (or equivalently, the number of child nodes), and then the IDs of the child nodes, which contain invalid IDs until the corresponding move is expanded. For this to work, I need the state to always generate its possible moves in the same order, so that I don't need to store the move itself; its index will suffice. Finally, the state itself is written to the normal node.
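To make the layout concrete, here is a hypothetical Python serializer for a normal node following the order just described: flag byte, first parent-ID block, referenced-node ID, move count, one child-ID slot per move, then the state. The field widths, the flag encoding, and the 4-slot parent block are my own guesses, not the actual on-disk format.

```python
import struct

INVALID_ID = 0xFFFFFFFFFFFFFFFF     # assumed marker for an empty ID slot
PARENT_SLOTS = 4                    # assumed size of the first parent block

def pack_normal_node(flags: int, parents: list, ref_id: int,
                     num_moves: int, state: bytes) -> bytes:
    """Serialize a normal node; all children start out unexpanded."""
    parent_block = (parents + [INVALID_ID] * PARENT_SLOTS)[:PARENT_SLOTS]
    return struct.pack(
        f"<B{PARENT_SLOTS}QQB{num_moves}Q",
        flags, *parent_block, ref_id, num_moves,
        *([INVALID_ID] * num_moves),    # child slots: invalid until expanded
    ) + state

node = pack_normal_node(flags=0b001, parents=[42], ref_id=INVALID_ID,
                        num_moves=9, state=b"." * 9)
# total: 1 + 4*8 + 8 + 1 + 9*8 + 9 = 123 bytes
```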
That's about all. This structure allows relatively straightforward implementation of everything I need. Let me just describe in terms of the stored format how I do expansion.
When the graph is requested to expand a certain child of a node, I first check that this child is not yet expanded: if the corresponding child node already exists in the graph, then this expansion request is an error. Then I read the state from the node and perform the move on it, so I obtain the state of the child node. I compute the category of the child state, and traverse the category chain, reading the state from each node in the chain to check for relations. If one is found, and it is an identical state, then this will be the child node itself. If it is a state related through a non-identity transformation, then I check the parent nodes for a transformation node with the same transformation; if one is found, then this will be the child node itself, otherwise a new transformation node is created, its child is set to the node found by the category search, and it is used as the child node. Finally, if no related state is found, then a new normal node is created, inserted into the category chain, and used as the child node.
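The cases above (minus transformation nodes, which I leave out for brevity) can be sketched in Python with an in-memory graph. The class layout and the use of a plain `hash` as the category are my stand-ins for the file-based structures, and the toy game below is invented purely to show two move orders transposing into one node.

```python
class Node:
    def __init__(self, state, moves):
        self.state = state
        self.parents = []
        self.children = [None] * len(moves)   # None = unexpanded move
        self.next_in_chain = None             # category chain link

class Graph:
    def __init__(self):
        self.heads = {}                       # category -> chain head

    def find_in_chain(self, category, state):
        node = self.heads.get(category)
        while node is not None:
            if node.state == state:           # identical state found
                return node
            node = node.next_in_chain
        return None

    def expand(self, node, move_index, apply_move, legal_moves):
        assert node.children[move_index] is None, "already expanded"
        child_state = apply_move(node.state, move_index)
        category = hash(child_state)
        child = self.find_in_chain(category, child_state)
        if child is None:                     # genuinely new state
            child = Node(child_state, legal_moves(child_state))
            child.next_in_chain = self.heads.get(category)
            self.heads[category] = child      # insert at chain head
        node.children[move_index] = child
        child.parents.append(node)
        return child

# Toy game: a state is the set of numbers picked so far; order doesn't
# matter, so different move orders transpose into the same state.
def legal_moves(state):
    return [n for n in (1, 2) if n not in state]

def apply_move(state, i):
    return frozenset(state) | {legal_moves(state)[i]}

g = Graph()
root = Node(frozenset(), legal_moves(frozenset()))
a = g.expand(root, 0, apply_move, legal_moves)   # picked {1}
b = g.expand(root, 1, apply_move, legal_moves)   # picked {2}
ab = g.expand(a, 0, apply_move, legal_moves)     # 1 then 2
ba = g.expand(b, 0, apply_move, legal_moves)     # 2 then 1: same node
assert ab is ba and len(ab.parents) == 2
```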
Either way, the child node is properly connected to the parent. Establishing this connection, and also establishing the connection between a new transformation node and its child, entails an update of the results, at least if the result of the child node is not unknown.
Also, whenever a new node is created, the corresponding node counter in the descriptor is incremented, and the new node is added to the ID list file. I might have missed something, but certainly not a lot. Except for the category head store itself, which I will get back to later.
Friday, December 17, 2010
Wednesday, December 1, 2010
GameCracker - 5. Categories
I have described why it is bad if we don't realize that a new state is the same or (symmetrically) equivalent to one that is already in the graph. I need all the help I can get when it comes to reducing the number of nodes. However, even if I do the best job possible, then - for complex games like chess - I still need an enormous number of different states to prove any interesting results. So searching the graph to see if a new state is a duplicate can be a serious issue. In particular, traversing the graph is out of the question.
My solution is the concept of categories. A category is basically a hash: a number that I assign to the states and I require that for symmetrical states, the category be the same. Then when a new state is about to be added as a result of an expansion, I query the category of this new state, and with this in my hand, I only need to check the nodes for which the category is the same (since, by definition, states with a different category cannot be related).
To implement this, I have included links in nodes which point to the next node of the same category. So for each category, I have a linked list, which I call a category chain, containing the relevant nodes. When searching for a related state, I just walk through the chain for the appropriate category, and if I don't find a match, then I can safely say that this state is new. The structure storing the category chains is a mapping from the category to the first node of the chain; this I call category heads.
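As a minimal Python sketch (names are mine, and the real nodes live on disk), the heads mapping and the chain walk look like this:

```python
class ChainNode:
    def __init__(self, state, next_node=None):
        self.state = state
        self.next = next_node     # next node of the same category, or None

def find_related(heads, category, state, related):
    """Walk the chain for `category`; return a node whose state is
    related to `state`, or None if the state is genuinely new."""
    node = heads.get(category)
    while node is not None:
        if related(node.state, state):
            return node
        node = node.next
    return None

def insert(heads, category, state):
    """Prepend a new node to its category chain."""
    heads[category] = ChainNode(state, heads.get(category))

heads = {}
insert(heads, 7, "a")
insert(heads, 7, "b")             # "b" becomes the new head of category 7
```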
Since adding new states to the graph is the elementary operation of the program, I need the search to be quick, which means I need the category chains to be short. This means that the categories need to be good hashes, with few duplicates and many different values for different states. On the other hand, if there are too many categories, then storing the heads and accessing them quickly can also be an issue. Finally, if symmetrical equivalences are possible, then the creation of the category number itself can be difficult as well: it must extract as much information as possible into the value while making sure that all the transformed versions of a state produce the same category.
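One common way to get a symmetry-invariant category (my illustration, not necessarily what GameCracker does) is to take the minimum over all transformed versions of the state, sketched here for a 3x3 tic-tac-toe board:

```python
def rotate(board):
    """Rotate a row-major 3x3 board 90 degrees clockwise."""
    return tuple(board[6 - 3 * (i % 3) + i // 3] for i in range(9))

def reflect(board):
    """Mirror a row-major 3x3 board horizontally."""
    return tuple(board[3 * (i // 3) + 2 - i % 3] for i in range(9))

def symmetries(board):
    b = tuple(board)
    for _ in range(4):            # 4 rotations, each with optional reflection
        yield b
        yield reflect(b)
        b = rotate(b)

def category(board):
    # The minimum over the orbit is the same for all symmetrical states;
    # hash it if a plain number is needed.
    return min(symmetries(board))

b1 = ('X', '.', '.', '.', '.', '.', '.', '.', '.')   # X in one corner
b2 = ('.', '.', '.', '.', '.', '.', '.', '.', 'X')   # X in the opposite corner
assert category(b1) == category(b2)
```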
I will probably come back to some of these issues later, but for now, I have introduced the concept of categories, quite an important component of the program, and there are a couple more significant things I want to cover before delving into details.
Tuesday, November 30, 2010
GameCracker - 4. Results
I've described the state graph of a game, and how results of certain states can be proven. I've also talked a little bit about how I store only a small portion of the entire state graph, as small as necessary for my purposes. From now on, when I say 'graph', I will mean the data structure of my program, consisting of normal and transformation nodes, and (usually) having a whole lot of edges that do not connect to anything. I call these edges 'unexpanded moves', for some reason.
There's an important aspect of the graph that I need to mention. In order to facilitate proofs, I attach to each node a result field, which indicates the most information that we currently know about the result of the state. In the early days of development, this field had four possible values: Unknown, WhiteWins, BlackWins, and Draw. Then I realized that by adding two more, I can further reduce the size of my graph. These additional values are WhiteWontWin (meaning that the actual result will be either Draw or BlackWins) and BlackWontWin.
In order to see how these values help, imagine that we are in a state 'A' where White is to move, and there is still a possibility that at least one of White's moves will lead to her winning the match. Let's call one of these moves (and also the state that this move leads to) 'P' for 'possibly good'. Now, if we find out that a move leads to a state in which the result is WhiteWontWin (let's call this move and state 'BD' for 'BlackWins or Draw'), then it is greatly beneficial to postpone the examination of this move and first consider 'P', which still might lead to a WhiteWins state. Here's why. If the actual result of this state 'A' is WhiteWins, then we will need to find a move which leads to a WhiteWins state; the 'BD' move is obviously not the correct one, and we won't need the actual result of this move for our proof, so calculating it is a waste of time, building this part of the graph is unnecessary. On the other hand, if the actual result of the state 'A' is not WhiteWins, then to prove it, we will need to prove that none of the moves can bring victory to White, and this includes finding out more about 'P'. So we will need to work with 'P', and there is a possibility that we won't need to work with 'BD', and this latter fact we wouldn't know if there weren't a WhiteWontWin state, if BD were still Unknown.
Now, to the main point of this entry. The graph object provides a single important operation (other than being able to move around in it). Given a node and a move from this node that isn't connected to anything yet, this operation expands the move, connecting it to a child node; I call this operation 'expand'. Two very important tasks need to be performed during expansion. The first is the update of the results of any nodes for which this new connection allows us to prove something more than we could before. It is of utmost importance that the result value of every single node is always up-to-date and as concrete as possible. The second task is to try and find an existing node which is equivalent to the new state, so that we won't have multiple instances of the same subgraph. I'll come back to this latter issue, because in this entry, we're talking about results.
So how to update the results during expansion? There are two cases when we obtain additional information. One is when we add a new node to the graph which contains an end state. (In end states, we know the result by definition: it is the result of the match.) The other is when a connection is made to an existing node in which we have a non-Unknown result. Both cases might provide some information on the result of the current node, the one which is being expanded, so we update its result. A result update consists of a recalculation of the result, and - if the new result is different from our previous knowledge - a recursive update of all the parents of this node (for this reason it is important to store references to the parent nodes). Recalculation is performed along the lines described in the second entry of this series, but of course extended for the two additional -WontWin possibilities. I think it is not too interesting an algorithm to describe here; I basically drew a big table with all the possibilities regarding the results we can find among the next-states, and saw what I could say. I needed to be careful so that I wouldn't accidentally label a state with a result which is too weak (like saying that it is Unknown when WhiteWontWin can be proven, or that it is WhiteWontWin when Draw can be proven, etc.). Now this algorithm is in place, and I don't need to worry about it; I just need to choose the next state and move to expand, in a way which preferably results in the smallest number of expansions.
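The recalculation-plus-propagation step can be sketched in Python. The rule below is my own reconstruction for a node where White is to move (Black's is the mirror image), not the blog's actual table, and the in-memory `Node` stands in for the file-based one:

```python
UNKNOWN, WHITE_WINS, BLACK_WINS, DRAW = "U", "WW", "BW", "D"
WHITE_WONT_WIN, BLACK_WONT_WIN = "WNW", "BNW"

def recalc_white_to_move(child_results):
    """Strongest provable result when White is to move (my reconstruction)."""
    rs = set(child_results)
    if WHITE_WINS in rs:
        return WHITE_WINS                        # one winning move suffices
    maybe_win = rs & {UNKNOWN, BLACK_WONT_WIN}   # some move might still win
    if not maybe_win:                            # White provably can't win
        if DRAW in rs:
            return DRAW                          # the best White can force
        if WHITE_WONT_WIN in rs:
            return WHITE_WONT_WIN                # draw-or-loss, not settled
        return BLACK_WINS                        # every move loses
    if rs & {DRAW, BLACK_WONT_WIN}:              # a move secures at least a draw
        return BLACK_WONT_WIN
    return UNKNOWN

class Node:
    def __init__(self, result=UNKNOWN):
        self.result = result
        self.parents = []
        self.children = []

def update(node):
    """Recalculate and, on change, recursively update the parents.
    (The real recalculation alternates by player to move; the sketch
    applies the White rule throughout.)"""
    new = recalc_white_to_move([c.result for c in node.children])
    if new != node.result:
        node.result = new
        for parent in node.parents:
            update(parent)

n = Node()
n.children = [Node(DRAW), Node(UNKNOWN)]
update(n)
assert n.result == BLACK_WONT_WIN   # at least a draw, a win still possible
```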
That's all for this entry. Next time I talk about the other aspect of expansion, the search for equivalent states. The reason this will be an issue is of course the humongous number of nodes in the graph.
[Figure: Proving something about P will be necessary for proving anything about A. However, trying to prove something more about BD might be a waste of time.]
Tuesday, November 23, 2010
GameCracker - 3. Graphs
The data structure that I use is (a subset of) the state graph of the game. The state graph is a useful model of the game in its entirety.
Imagine that we use dots to represent states of the game; we call these the nodes of the graph. They don't have to be dots, they could be small drawings of the current board for easy visualization. Then from each of these nodes, we have arrows (or edges in graph terms) going out and pointing to different nodes. These represent the possible moves in the state corresponding to the node. It can be useful to put some textual representation of the move onto the arrow as a label just so we can easily find our way. Then we can imagine that actual matches are just a path in this graph, starting from the root node, which denotes the initial position, and moving along the edges which correspond to the moves that the players actually make. We can also think of this the other way around: if we start from the initial position and move along the edges until we reach a node which has no outgoing arrows, no legal moves, then the path we take defines a valid match that could possibly happen between two players.
Just so I can talk about stuff, let me define a couple of notions. I call a state S2 a child of another state S1, if there is an edge leading from node S1 to S2, or (in different words) if there is a move in S1 which results in S2. Appropriately, I call S1 the parent of S2, because you can get to S2 from S1 with a move. Finally, I will use the term leaf to refer to nodes which have no children, these are the end positions of the game.
Now I think it is time for an illustration.
This is a very, very small part of the state graph of tic-tac-toe. In the actual graph, the root node has nine children, and each of these nine children has eight children of its own, which means that if I wanted to draw the entire graph up to just two moves, that would mean 1+9+9*8=82 nodes, so now you can probably forgive me for drawing only 9 of them; it was not only laziness.
My main point with this is that state graphs are huge. Even in a game as simple and limited as tic-tac-toe, the number of nodes is enormous. The graph, up to 5 moves (which is the length of the shortest possible match), contains 18730 nodes. If we consider chess, or some other, more interesting game, the size of the state graph is astronomical.
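These counts are easy to check: a k-move path in tic-tac-toe is an ordered choice of k of the 9 squares, so before any merging of transpositions the graph down to depth d has the sum of 9!/(9-k)! nodes for k from 0 to d:

```python
from math import perm

def nodes_up_to(depth):
    # paths of length k = ordered choices of k of the 9 squares
    return sum(perm(9, k) for k in range(depth + 1))

assert nodes_up_to(2) == 82       # the 1 + 9 + 9*8 from the illustration
assert nodes_up_to(5) == 18730    # the count quoted above
```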
My program works by building up a part of this graph. It will never need the entirety of it due to how results are proven, which I described in my previous entry. As an example, suppose that chess can be won by White after the first move e2-e4. In this case, all the other 19 moves that White could start with are totally uninteresting and there is no need to consider them. This makes huge parts of the state graph irrelevant that we do not need to build up, because they are not needed for the proof of White's win in the initial position.
Another thing that I mentioned previously is that I definitely want to avoid doing the same work multiple times. In terms of the graph, this means checking if a new state that we come across is equivalent to one which we have already encountered and added to the graph. If it is, we can just draw the arrow to this existing state and not have to create two instances of everything beneath it. Example incoming!
Nothing complicated is happening. This does mean that nodes can have multiple parents, but administering this is a small price to pay.
I have described a different kind of equivalence, where two states are not the same, but are symmetrical, and thus we still don't need to examine both of them. These I handle differently, since just plainly connecting them would violate what the arrows mean. In order to note that a move leads to a state that is related to an existing one through some transformation, I use a different kind of node, which I call a transformation node (and, in contrast, I call normal nodes ... normal nodes). The semantics of edges leaving a transformation node are different: they do not model moves, they model the transformation. So each transformation node has a single arrow leaving it, and this arrow points to the other, symmetrical state. I also store the exact transformation in the node. Example to the rescue!
What I have drawn here is the same subgraph of the tic-tac-toe graph as in the first illustration, but using transformation nodes. Due to the fact that I have included only ten states in the first place, we cannot see any great benefits at first glance (we still have ten nodes), but note that originally there were seven nodes which required further work, and now there are only three. Once we have added a transformation node, we basically don't have to worry about it as an individual state, only about the normal node that it references. Of course, when we move around in the graph, we need to take special care around transformation nodes, but they will not need any extra effort to prove beyond what is needed for the corresponding normal node.
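The bookkeeping around a transformation node can be pictured with a small Python sketch for a 3x3 board (the index table and names are my illustration): the node stores which symmetry relates its state to the referenced normal node, and a move proven good for the normal node is carried over by transforming the move itself.

```python
# ROT90[i] is the old index whose cell lands at new index i after a
# 90-degree clockwise rotation of a row-major 3x3 board.
ROT90 = [6, 3, 0, 7, 4, 1, 8, 5, 2]

def apply_transformation(board, index_map):
    """Return the transformed board."""
    return tuple(board[index_map[i]] for i in range(9))

def transform_move(square, index_map):
    """Map a move (a square index) in the referenced state to the
    corresponding move in the transformation node's state."""
    return index_map.index(square)

original = ('X', '.', '.', '.', '.', '.', '.', '.', '.')
rotated = apply_transformation(original, ROT90)
assert rotated.index('X') == 2        # top-left ends up top-right
assert transform_move(0, ROT90) == 2  # and so does a move to that square
```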
So this is the basic structure. Tune in for the next time when I describe how the result of the individual nodes are created in a way that also provides all the aspects of provability that I talked about in the previous entry.
Saturday, November 20, 2010
GameCracker - 2. From simple to not so much
The little pseudo-code I gave in the previous entry is a pretty straightforward algorithm to determine the result of a game state. Nevertheless, that is not quite how I do it. It actually had never occurred to me to implement the program this way, and now that I have thought about the reasons, I can give you two.
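That pseudo-code isn't repeated here, but the straightforward algorithm in question is plain minimax over the states. Here is my Python paraphrase of the idea, demonstrated on a toy game of my own choosing (a pile of tokens, take 1 or 2, taking the last token wins):

```python
WHITE_WINS, BLACK_WINS, DRAW = "WhiteWins", "BlackWins", "Draw"

def moves(n):
    """Pile sizes reachable from a pile of n tokens (take 1 or 2)."""
    return [n - t for t in (1, 2) if t <= n]

def solve(n, white_to_move):
    """Naive recursive result of the state: recurse into every child
    and pick the best result for the player to move."""
    if n == 0:                       # previous player took the last token
        return BLACK_WINS if white_to_move else WHITE_WINS
    results = [solve(m, not white_to_move) for m in moves(n)]
    best = WHITE_WINS if white_to_move else BLACK_WINS
    if best in results:
        return best                  # a move achieving the mover's win exists
    if DRAW in results:
        return DRAW                  # no win, but a draw can be forced
    return BLACK_WINS if white_to_move else WHITE_WINS
```

In this game a pile that is a multiple of 3 loses for the player to move; `solve(3, True)` returns BlackWins, while `solve(4, True)` returns WhiteWins.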
First of all, it is not sufficient for me that my program tells me the result of a state: I need it to be able to prove what it claims. By this I mean that taking on the role of the player to move in that state, the program must be able to achieve the claimed result against any opponent, against any sequence of moves; and on the other hand, it must be able to prove that a better result can always be prevented by the opponent. This is a pretty abstract formulation, so let me be more concrete. Suppose that we are in a state where it is White's turn to move. If the program says that White will win, proving this means providing a move, and (recursively) proving that the state after the move is also a win for White. To prove a draw result, it has to be able to indicate a move, proving that the state after the move is a draw, and also proving that none of White's moves will bring her victory. Finally, Black's win can be proven by showing that the state after every single one of White's moves is a loss for her.
Maybe I'm talking a bit too much about the concept of proving a result, but I'll just describe it from another aspect. Suppose that the program has solved the initial state of a game. If it turns out to be a win for White, the fact that my program can prove this claim means that by playing White, it can always win, against anyone, no matter what. In case of a draw, my program playing White can always achieve at least a draw (so it will never lose), but this alone isn't enough; it needs to be shown that White has no winning move. Proving this means that the program has to also be able to achieve at least a draw when playing Black. Finally, if chess is actually winnable by Black, then of course my program must be able to win with the black pieces. And one last thing: when the program is actually playing one of these games, I want it to be able to give the correct moves in a very short time, basically using some look-up method, and not by performing a search like chess engines do. This would not be possible using the simple algorithm.
The second reason against it is that there are a lot of possibilities in a game. And I mean lot. I did say that I disregard any actual time and storage requirements, but still, there is one significant way of reducing the size of the work needed to prove results that I won't ignore. And it is to avoid calculating the result of a state multiple times.
Let us consider chess. Suppose that, after ten years of work, the program has finally obtained the result of the state after moves 1. e2-e4, e7-e6; 2. d2-d3, d7-d5. Suppose also that later the program finds itself at a point where it needs the result of the state after moves 1. e2-e4, d7-d5; 2. d2-d3, e7-e6. This is the exact same state as the previous one, but the naive little algorithm doesn't know that, and spends another ten years with this state. I definitely want to avoid this. Now let me go one step further. There are also states that are, although different, similar enough so that the proof the program finds for the result of one is applicable to the other. I call this relationship either symmetry or equivalence. Reversi is an excellent example. In the starting position, White has four possible moves, but the resulting positions are symmetrical, basically the same. You just have to apply a transformation to the board (a rotation or a reflection) to get one from the other, and the proof for one is applicable to the other by applying the same transformation to the moves themselves.
So these were my two needs: the ability to prove claims about results of game states, and the ability to avoid computation of equivalent states. Next time, I'll describe how I've built GameCracker with these requirements in mind.
First of all, it is not sufficient for me that my program tells me the result of a state: I need it to be able to prove what it claims. By this I mean that taking on the role of the player to move in that state, the program must be able to achieve the claimed result against any opponent, against any sequence of moves; and on the other hand, it must be able to prove that a better result can always be prevented by the opponent. This is a pretty abstract formulation, so let me be more concrete. Suppose that we are in a state where it is White's turn to move. If the program says that White will win, proving this means providing a move, and (recursively) proving that the state after the move is also a win for White. To prove a draw result, it has to be able to indicate a move, proving that the state after the move is a draw, and also proving that none of White's moves will bring her victory. Finally, Black's win can be proven by showing that the state after every single one of White's moves is a loss for her.
Maybe I'm talking a bit too much about the concept of proving a result, but I'll just describe it from another aspect. Suppose that the program has solved the initial state of a game. If it turns out to be a win for White, the fact that my program can prove this claim means that by playing White, it can always win, against anyone, no matter what. In case of a draw, my program playing White can always achieve at least a draw (so it will never lose), but this alone isn't enough: it needs to be shown that White has no winning move. Proving this means that the program has to also be able to achieve at least a draw when playing Black. Finally, if chess is actually winnable by Black, then of course my program must be able to win with the black pieces. And one last thing: when the program is actually playing one of these games, I want it to be able to give the correct moves in a very short time, basically using some look-up method, and not by performing a search like chess engines do. This would not be possible using the simple algorithm.
The second reason against it is that there are a lot of possibilities in a game. And I mean a lot. I did say that I disregard any actual time and storage requirements, but still, there is one significant way of reducing the amount of work needed to prove results that I won't ignore. And it is to avoid calculating the result of a state multiple times.
Let us consider chess. Suppose that, after ten years of work, the program has finally obtained the result of the state after moves 1. e2-e4, e7-e6; 2. d2-d3, d7-d5. Suppose also that later the program finds itself at a point where it needs the result of the state after moves 1. e2-e4, d7-d5; 2. d2-d3, e7-e6. This is the exact same state as the previous one, but the naive little algorithm doesn't know that, and spends another ten years with this state. I definitely want to avoid this. Now let me go one step further. There are also states that are, although different, similar enough so that the proof the program finds for the result of one is applicable to the other. I call this relationship either symmetry or equivalence. Reversi is an excellent example. In the starting position, White has four possible moves, but the resulting positions are symmetrical, basically the same. You just have to apply a transformation to the board (a rotation or a reflection) to get one from the other, and the proof for one is applicable to the other by applying the same transformation to the moves themselves.
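To make the symmetry idea concrete, here is a small illustrative Java sketch of my own (the representation and names are not from GameCracker): it reduces a square board to a canonical key by generating all eight symmetries of the square and taking the lexicographically smallest one, so two equivalent states yield the same key and need to be solved only once.

```java
// Illustrative sketch: canonicalize a square board under the 8 symmetries
// of the square (4 rotations x optional mirror).
public class Canonical {
    // rotate an n x n board 90 degrees clockwise: (i,j) -> (j, n-1-i)
    static char[][] rotate(char[][] b) {
        int n = b.length;
        char[][] r = new char[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                r[j][n - 1 - i] = b[i][j];
        return r;
    }

    // mirror the board horizontally
    static char[][] mirror(char[][] b) {
        int n = b.length;
        char[][] r = new char[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                r[i][n - 1 - j] = b[i][j];
        return r;
    }

    static String key(char[][] b) {
        StringBuilder sb = new StringBuilder();
        for (char[] row : b) sb.append(row);
        return sb.toString();
    }

    // smallest key among the 8 transformed boards is the canonical form
    static String canonical(char[][] b) {
        String best = null;
        char[][] cur = b;
        for (int m = 0; m < 2; m++) {
            for (int r = 0; r < 4; r++) {
                String k = key(cur);
                if (best == null || k.compareTo(best) < 0) best = k;
                cur = rotate(cur);
            }
            cur = mirror(cur);
        }
        return best;
    }

    public static void main(String[] args) {
        char[][] a = {{'X','.','.'},{'.','.','.'},{'.','.','.'}};
        char[][] c = {{'.','.','X'},{'.','.','.'},{'.','.','.'}};
        // the two boards differ only by a rotation, so their keys match
        System.out.println(canonical(a).equals(canonical(c))); // prints true
    }
}
```

Using the canonical key as the index into a table of solved states is one simple way to recognize that a "new" state has in fact already been computed.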
So these were my two needs: the ability to prove claims about results of game states, and the ability to avoid computation of equivalent states. Next time, I'll describe how I've built GameCracker with these requirements in mind.
Thursday, November 18, 2010
GameCracker - 1. Introduction
Let us consider two-player games. Not games of chance, like card games, where the players do not have all the information. I am interested in games like reversi or chess, where the players perform moves (usually taking turns), and the rules of the game stipulate what moves are legal in a certain state, and exactly what new state the move will lead to. For these games, it is in theory possible to determine which player can win provided that they both play perfectly. Let me demonstrate this fairly obvious fact with some pseudocode.
WhoWillWin(state):
    if state.isGameOver: return state.result
    canBeDraw = false
    playerToMove = state.playerToMove
    for each move in state.moves:
        whoWinsAfterMove = WhoWillWin(state.move(move))
        if whoWinsAfterMove == playerToMove: return playerToMove
        if whoWinsAfterMove == DRAW: canBeDraw = true
    if canBeDraw: return DRAW
    return playerToMove.opponent

This will determine the result of the game starting from an arbitrary state. Let me just write out how this works in prose. Let us call the player to move White, and her opponent Black. The WhoWillWin function checks if White has any moves which lead to her win; if there is one, then she can win now by choosing it. If she does not have a winning move, then she still might achieve a draw (for games in which there is such a thing at all). So if she has a move which leads to a draw, she should obviously choose it, since we have already established that she cannot win. If none of her moves lead to a win or a draw, that means that Black will win no matter what White moves, so the current state is a win for Black as well. It is as simple as that, whether the game in question is tic-tac-toe, reversi, or chess. (Incidentally, these are the games that I will keep mentioning as examples.)
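To show that the pseudocode really is all there is to it, here is a self-contained Java rendering of WhoWillWin for tic-tac-toe. This is my own illustrative sketch, not GameCracker's code; it brute-forces the whole game tree exactly as described above, with no caching of any kind.

```java
// WhoWillWin from the pseudocode, made concrete for tic-tac-toe.
// Board cells: 0 = empty, 1 = X, 2 = O.
public class TicTacToe {
    static final int DRAW = 0, X = 1, O = 2;

    // returns X or O if someone has three in a row, 0 otherwise
    static int winner(int[] b) {
        int[][] lines = {{0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}};
        for (int[] l : lines)
            if (b[l[0]] != 0 && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
                return b[l[0]];
        return 0;
    }

    // returns X, O or DRAW for the state 'b' with 'playerToMove' to move
    static int whoWillWin(int[] b, int playerToMove) {
        int w = winner(b);
        if (w != 0) return w;              // game over: someone already won
        boolean full = true;
        for (int v : b) if (v == 0) full = false;
        if (full) return DRAW;             // game over: board full, no winner
        boolean canBeDraw = false;
        for (int i = 0; i < 9; i++) {      // for each legal move
            if (b[i] != 0) continue;
            b[i] = playerToMove;           // make the move
            int r = whoWillWin(b, 3 - playerToMove);
            b[i] = 0;                      // undo the move
            if (r == playerToMove) return playerToMove; // winning move found
            if (r == DRAW) canBeDraw = true;
        }
        return canBeDraw ? DRAW : 3 - playerToMove;
    }

    public static void main(String[] args) {
        // solve the game from the empty board, X to move
        System.out.println(whoWillWin(new int[9], X)); // prints 0: a draw
    }
}
```

Even this naive version solves tic-tac-toe in well under a second; the practicality question only bites for bigger games.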
There are a couple of things we need to stipulate for this algorithm to work. First of all, the game must end after a limited number of moves, otherwise our recursive function might get trapped in an infinite sequence of moves, never finding a game-over. Fortunately, tic-tac-toe cannot last longer than 9 moves, reversi takes at most 60, and chess also cannot go on indefinitely thanks to the 50-move draw rule (though I will not bother to calculate the maximum possible length of a chess match, I'm not that interested).
Another requirement is that there is always a finite number of possible moves. If this is not the case, then of course the for-loop might never terminate. There was a game popular in my school days, a variation of tic-tac-toe, where you had to get 5 in a row, and the board had no bounds. You could place your X or O however far away you wanted. (The matches could also go on forever, at least in theory. In practice, they never did.) This is the only real counter-example I can think of, but I can imagine games where the players take turns placing points on a line or in a geometric area, and of course the continuous domain would imply an infinite number of possible moves, even if the length of the match were limited.
Basically that's all we need. There are some other, fairly obvious requirements, like that the current state and the move must determine the state after the move is made (so nothing random), or that the result of a game is either a draw or one of the players winning. But now we can probably see that there is a good chunk of games that can be 'solved' using this method, by feeding the initial position to the function. A different question is whether it is practical to do so. The answer is obviously 'yes' for tic-tac-toe, and obviously 'no' for chess. But let us ignore this for now.
I have a program that can, given 'sufficient' time and storage capacity, solve such games. I call it GameCracker, and I plan to describe some aspects of it in this blog. Mostly for my own benefit, kind of as a documentation, but still, putting it out there. And I think this is good enough for an introduction.
Friday, October 29, 2010
Floating Point Equivalence - 2. GPU vs CPU
As I mentioned, I aimed to achieve equivalent execution between the original algorithm and the one running on GPU. Testing numerical equality between the two main languages was just the first step.
I use OpenCL as the platform for GPU programming. The code which is intended to be executed on the graphics hardware is based on C, so it is quite easy to use and understand. The OpenCL specification defines an optional extension for double precision floating point arithmetic, and this extension is supported by the NVIDIA driver on the GPU I use (GeForce GTX 285). So my plan was to read back partial results from the GPU to my code, and compare them to the corresponding data produced by the original program.
Without further ado, I will just say what I've found out after many hours of meticulous testing. Even when considering the basic arithmetic operations, the results of calculations are not necessarily the same. I got to a piece of code which calculated an expression like "(a-b)*(a-b)+(x-y)*(x-y)", and the result was one bit off when compared to the number produced by C or Java code. At first I thought that the GPU implemented floating point rounding incorrectly, that the result of the expression fell between two neighbouring double values, and the incorrect one was selected. It turned out that this wasn't exactly the case.
I've created a very simple C program to make sure that nothing irrelevant interfered with my test. The GPU code consists of the following function:
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void test(
        double a, double b, double x, double y,
        __global double* c) {
    c[0]=a-b;
    c[1]=(a-b)*(a-b);
    c[2]=x-y;
    c[3]=(x-y)*(x-y);
    c[4]=(a-b)*(a-b)+(x-y)*(x-y);
}

I won't digress by explaining the details of this OpenCL C code. It takes four double arguments, and returns the results of five expressions in the buffer 'c'. The values of the parameters are declared in the C code as follows:
double a=22;
double b=12.020105095868775;
double x=201;
double y=210;

I compared the values returned by the GPU code with those that the C program itself computed, using the non-converting cast to long to ensure that even a single-bit difference becomes visible. These are the results:

a-b      CPU:  4621807799426188662, GPU:  4621807799426188662 -> 1
(a-b)^2  CPU:  4636709024391772619, GPU:  4636709024391772619 -> 1
x-y      CPU: -4602115869219225600, GPU: -4602115869219225600 -> 1
(x-y)^2  CPU:  4635400285215260672, GPU:  4635400285215260672 -> 1
sum      CPU:  4640558254430887142, GPU:  4640558254430887141 -> 0

The last column is the result of simple double equality. The value of each subexpression appears to be the same, except for the very last one. Very strange. Here were two equal numbers, (a-b)^2 and (x-y)^2, and adding them yields different results. This is when I thought that addition doesn't work correctly on the GPU side. To make sure that this is the case, I've planted the values of (a-b)^2 and (x-y)^2 as double literals into the OpenCL code, checked that they are the values that I wanted to add (by returning them in the output buffer, I saw the exact same values as in the table above), and added these. The result was the CPU version of the sum.
What can this mean? Only one thing: although these pairs of numbers appear to be the same when observed from CPU code, they are actually not. Their sums are different. My conclusion is that there are hidden details in these doubles on the GPU side. Internally, the numbers seem to be stored in a precision higher than 64 bits. It is quite possible that the same is true for the CPU, but the internal precisions have to be different. As the double data type is strictly 64-bit, this additional data cannot be observed directly, it is manifested in the results of further operations, when the numerical difference accumulates to an amount which becomes visible when rounding to the nearest floating point value representable in 64 bits.
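I cannot verify that the GPU uses exactly this mechanism, but a closely related hidden-precision effect can be demonstrated on the CPU side in pure Java with Math.fma (available since Java 9), which computes a*b+c with a single rounding instead of rounding the product first. The inputs below are my own constructed example, not the values from the original test:

```java
// Sketch: a fused multiply-add keeps the intermediate product in full
// precision, while separate operations round after each step.
public class FmaDemo {
    public static void main(String[] args) {
        double a = 0.1, b = 10.0, c = -1.0;
        // a*b is exactly 1 + 2^-54, which rounds to 1.0; adding c gives 0.0
        double separate = a * b + c;
        // the fused version rounds only once, so the 2^-54 survives
        double fused = Math.fma(a, b, c);
        System.out.println(separate); // prints 0.0
        System.out.println(fused);    // prints 2^-54, a tiny nonzero value
        System.out.println(separate == fused); // prints false
    }
}
```

The same expression on the same inputs, one bit pattern apart, purely because one path carried more internal precision than the other.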
Conclusion
Different hardware has different internal numerical precision, even if the data types it uses are equivalent. So under these conditions, one cannot hope that performing the exact same calculations on the exact same input results in the exact same output. There is nothing that can be done about this.
Tuesday, October 26, 2010
Floating Point Equivalence - 1. Java vs C
Recently I've been working on creating a GPU-enabled version of an existing program implementing an image processing algorithm. The program was written in C/C++, and I decided that I could get the job done much faster and better if I rewrote the whole thing in Java. (I just love Java, love working with it, I do it very efficiently, plus the C program was not very big and required some restructuring for the GPU architecture, so I believe using it as a starting point and working with C code would have made my job much harder.)
In order to be able to validate the new code, my aim was for it to implement the same algorithm in exactly the same way. By comparing the data in the two programs at various steps, I could quickly see if I made a mistake. This led to the need to print exact double values in both C and Java. Then I could just print the data to a file and compare the two versions.
Using printf
In Java, whenever I need to see the "exact" value of a double, I use Double.toString. This method will produce as many digits in the fraction as necessary to differentiate my double value from any other doubles. That is, the string produced by this method is unique for each double, so if two such strings are the same, then the two doubles must also be exactly the same. However, I do not know of a similar function in C. Actually, the only C function I know of that can print doubles is printf. Fortunately, Java also has printf, their syntaxes seem pretty much the same to me, so let's use them!
For the examples below, I will use the number 1/3. It is a nice number which has infinitely many digits both in base-10 and base-2. Note that to produce this value in code, you need to write "double d=1.0/3;" or something similar; the expression "1/3" results in zero as it is a division between two integers, so the result is also an integer.
Now, if I write printf("%f",d) (or the Java equivalent System.out.printf((Locale)null, "%f",d)), the result is "0.333333" in both languages. Good news, we got the same result, but we should know that the format string "%f" has a parameter specifying the precision, namely the number of digits after the decimal point, and its default value is 6. This means that we lose information. It is quite possible that two doubles differ only in later digits, so to make sure we see everything there is to see, let's specify a sufficiently large precision (I do not know exactly how many digits are needed, but Double.toString usually returns about 15, so 20 should be enough).
printf("%.20f",d) -- C: "0.33333333333333331483" Java: "0.33333333333333330000"
Definitely not the same. Does this mean that "1.0/3" is a different double value in the two languages? No, actually the difference comes from the printf implementations. This realization was my main motivation for this blog post: the format string syntaxes are so similar that one would not expect different behaviour.
It seems that the C printf takes the double value, and returns the decimal number which is the closest to it among the decimals having the specified number of digits after the decimal point. This is in harmony with the manual page of this function:
    f, F   The double argument is rounded and converted to decimal notation in the style [-]ddd.ddd, where the number of digits after the decimal-point character is equal to the precision specification.

In Java, however, we only get a limited number of valuable digits even if we specify a high precision. The documentation specifies the exact behaviour:

    If the precision is less than the number of digits which would appear after the decimal point in the string returned by Float.toString(float) or Double.toString(double) respectively, then the value will be rounded using the round half up algorithm. Otherwise, zeros may be appended to reach the precision.

This unfortunately means that we cannot use "%f" for the purpose of comparing doubles. What to do then?
Seeing the bits
Well, the 64-bit floating point representation of doubles is standard, so if we could somehow print this representation directly, that would be reliable. (Note: Java's double is always 64-bit, but C's double does not have to be, at least in theory. But for our purposes let us assume it is.)
Java has a function for just this: Double.doubleToRawLongBits. The long value that this method returns will be represented by the exact same 64 bits as the original double. Of course its value will be something totally different, but nevertheless, the long value is a suitable, unique representation of the double, and it can also be printed precisely, with no shenanigans with rounding or zero-padding: System.out.printf("%d", Double.doubleToRawLongBits(d)); -> "4599676419421066581". If we can get the same 64-bit integral value in C, we're golden. Unfortunately, I do not know of a library function which does this.
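As a quick self-contained illustration of the Java side, using the same 1.0/3 value as above:

```java
// Two doubles are bit-identical exactly when their raw long bits are equal.
public class Bits {
    public static void main(String[] args) {
        double d = 1.0 / 3;
        long bits = Double.doubleToRawLongBits(d);
        System.out.println(bits); // prints 4599676419421066581
        // the bits form a unique, exactly printable fingerprint of the double
        System.out.println(bits == Double.doubleToRawLongBits(1.0 / 3)); // prints true
    }
}
```

(A side note: Double.doubleToLongBits also exists; it differs only in that it collapses all NaN bit patterns into one canonical NaN, while the raw variant preserves them.)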
I've found the solution here. The key concept is to change the type without numeric casts, and there are two ways to do it. One is through pointers. [Note: I use "long long" for the C data type. I believe it is the one which is most commonly 64-bit, although, once again, the specification does not require it to be.] We first wander into pointerville by taking the address of our double variable: &d; the type of this expression is double*. Then we cast this double pointer to a long long pointer: (long long*)&d. This cast retains the value stored in the variable d, and now we have a long long pointer to it, so all we need to do is dereference it: *((long long*)&d). It is not very pretty, but also not complicated, and it works.
So the solution: printf("%lld", *((long long*)&d)); mind you, "%d" does not work correctly, we need the "ll" length modifier to tell printf about the exact type of the argument, otherwise it will assume it is an int.
The other way is to use a union:
union {
    double a;
    long long b;
} u;
u.a=d;
printf("%lld", u.b);

The union achieves the same thing that the pointer casts did: it reinterprets an existing 64-bit value as a different type without changing it. It is a bit more verbose, so I just used the first solution myself, but it is worth mentioning that Java actually implements Double.doubleToRawLongBits using the union method.
Conclusion
The main thing I've learned with this little adventure was that the string formatting facilities of C and Java differ, even though the syntax is similar. One has to be especially careful when trying to print the exact value of a floating point number.
Saturday, October 23, 2010
Hello World
I've just spent about one day on a seemingly relatively simple software problem, and I've learnt a couple of things. I realized that it would be useful to publish my experiences. Even if no-one else benefits from them, I could revisit these moments I would otherwise possibly forget. I also incidentally stumbled upon Dustin's blog entry encouraging just this behaviour, and I completely agree with it, so I finally budged. I don't know how often I'll post, but I will try to get myself to write about anything that I find worthy to remember, at least for myself.
PS. Wow, my English is rusty.