2010. december 17., péntek

GameCracker - 6. Persistence

Due to the time and storage demand of the task at hand, it is obviously not viable to store the graph in memory. So I use the hard drive.

For some reason, I decided that I would store the graph in a series of files whose size is about 2 gigabytes. Each node has a unique 64-bit identifier. The top half of the ID is the index of the file containing the node, and the bottom half is the position within the file. This ID can be used to reference nodes, which is necessary to implement edges and category chains.

Apart from the graph files, I have a descriptor in which I store some global info like the game in question, the current graph file and its length, and the number of normal and transformation nodes. For convenience, I have a separate file which just lists all the node IDs. The ID file is not useful for normal operation but it helps when I want to traverse the graph for example for testing purposes.

The final file required by my program stores the category heads. This is a file-based mapping from category to node ID. The category chains are implemented by having each node store the ID of the next node in the chain.

Now I shall describe how exactly I store the nodes. The first byte contains a flag which marks the type of the node as either normal or transformation, the result of the node, and the player to move. Then there is a block of parent IDs. I mentioned that I need to store the parents of each node in order to properly update the results of the nodes. However, unlike the number of children (which is just the number of legal moves in the state), the number of parents cannot be known in advance, the list of parents must be allowed to grow indefinitely. So I allocate blocks of IDs, mark the empty spaces with an invalid ID, and when a new parent is added when only a single slot is left, then I allocate a new block at the end of the file, and write the ID (the file and position) of this block into the last slot. This creates a linked list of parent ID blocks, which does the job I need. So after the first byte, I have this first parent ID block. Then I have 8 bytes for a single node ID; I call this the referenced node. For normal nodes, I use this for the category chains, I write here the next node in the current chain. For transformation nodes, as they can have only a single children, I store it here. Note that transformation nodes need not be included in category chains, since if a new state is related to a state in a transformation node, then it must be related to the normal node to which the transformation node points, and this normal node will reside in a chain, so it can be found for the new state. Then, if I want to minimize the number of nodes, I can check the parents of this normal node for a transformation node which will be exactly what is needed, so I can still avoid creating multiple identical transformation nodes.

Okey, so after the referenced node, a transformation node has the transformation written into it, and that's all. A normal node has the number of moves (or equivalently, the number of child nodes), and then the IDs of the child nodes, initially containing invalid IDs until the corresponding move is expanded. For this to work, I need that the state always generates its possible moves in the same order so that I don't need to store the move itself, its index will suffice. Finally, the state itself is written to the normal node.

That's about all. This structure allows relatively straightforward implementation of everything I need. Let me just describe in terms of the stored format how I do expansion.

When the graph is requested to expand a certain child of a node, I first check if this child is not yet expanded. If the corresponding child node already exists in the graph, then this expansion request is an error. Then I read the state from the node, perform the move on it, so I obtain the state of the child node. I compute the category of the child state, and traverse the category chain, reading the state from each node in the chain to check for relations. If one is found, and it is an identical state, then this will be the child node itself. If it is a state related through a non-identity transformation, then I check the parent nodes for a transformation node with the same transformation; if one is found, then this will be the child node itself, otherwise a new transformation node is created, its child is set to the node found by the category search, and used as the child node. Finally if no related state is found, then a new normal node is created, inserted into the category chain, and used as the child node.

Either way, the child node is properly connected to the parent. Establishing this connection, and also establishing the connection between a new transformation node and its child, entails an update of the results, at least if the result of the child node is not unknown.

Also, whenever a new node is created, the corresponding node counter in the descriptor is incremented, and the new node is added to the ID list file. I might have missed something, but certainly not a lot. Except for the category head store itself, which I will get back to later.

Nincsenek megjegyzések:

Megjegyzés küldése