Jan 30, 2007

[Guide] Transposition tables

Exactly three weeks after the last release it is finally time for the transposition table guide, and after this, the new version of Mediocre.

Beware, transposition tables introduce a whole range of new possible bugs, as well as reviving old ones. I for one should know after these three nightmare weeks. :)

In this guide I will go through the basic idea of a transposition table, as well as hashtables and things like replacement schemes and avoiding collisions.

What I will not cover is the memory usage issue. This is very implementation specific and quite complicated. I will be using a hashtable (an array) with 1048583 slots (2^20 + 7, to make it a prime number). It seems to work ok, but frankly I do not know how much memory it would use if filled completely, with the moves being stored as objects etc.

I will most likely get back to the memory issue in the future when it is time for optimizations, but for now I will assume the hashtable is of a suitable, fixed size.

Transposition Table basics

We start with the basics. A transposition is a position that can occur after different sequences of moves. For example:

This position can occur after:
1. d4 d5
2. c4 ...
As well as:
1. c4 d5
2. d4 ...
It makes no sense calculating this position again since we should already know what the best move is from here.

Let us say we searched 8 ply from the start position and found a line:
1. d4  d5
2. c4 c6
3. Nf3 Nf6
4. e3 e6
and evaluated it to +10.

When we get to this position after 1. c4 d5 2. d4 ... we already know that the best answer black has is c6 and the evaluation is +10.

So we just saved 5 plies of searching.

In general, opening and middle game positions do not offer that many useful transpositions and the gains are fairly limited, but in endings, especially king and pawn endings, transpositions are very common and the gains from using a transposition table are enormous.

But this is not all, and not even the most important factor. As mentioned before, alpha-beta search relies heavily on move ordering. If the best move is searched first, alpha-beta can cut off many more lines, making it much more effective.

Moves from earlier searches, that is moves stored in the transposition table, tend to be very good moves, even if the earlier search was not deep enough for its values to replace the current search. If we search this move first every time, we have a good chance of hitting the best move much more often.

This helps more than any other move ordering scheme (like killer moves or the history heuristic).
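
As a hedged illustration of the idea (assuming, for brevity, that moves are encoded as ints and that the hash move has already been fetched from the table; Mediocre itself stores Move objects), searching the hash move first can be as simple as swapping it to the front of the generated move list:

// Minimal sketch (hypothetical names): move the transposition table move
// to the front of the move list so alpha-beta searches it first.
private static void putHashMoveFirst(int[] moves, int hashMove) {
    for (int i = 0; i < moves.length; i++) {
        if (moves[i] == hashMove) {
            int tmp = moves[0];   // swap the hash move into slot 0
            moves[0] = moves[i];
            moves[i] = tmp;
            return;
        }
    }
}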

Hashtables

Earlier I have been talking about transposition tables and hashtables like they are the same thing. That is not quite true though. A transposition table uses a hashtable to store the searched positions.

A hashtable can be used for many things. One common use is hashing files on the internet, that is giving each file a unique key and storing the keys in a table, so we do not need the file-name (or the whole file) to find it. The hashing is the same idea as our Zobrist keys (we give every position a unique key).

Storing large ranges in a smaller table

A hashtable is a way to store a large range of values into a smaller, more manageable table.

If we move 10 plies from the start position we have about 70 trillion possible positions.

As explained earlier we use Zobrist keys to represent the positions. They look like this:
2809723212531837306
6543248948113846523
8949841559963541289
As stated before they do not cover all possible positions, but they cover far more than the 70 trillion that can arise from the starting position.

So if we want to store the searched positions we might think we need a table with at least 70 trillion slots available, and then use the index to find the position we were looking for. Like this:
Index                   Zobrist key and eval etc.
.
.
.
2809723212531837306 2809723212531837306 (+ eval etc.)
2809723212531837307 2809723212531837307
2809723212531837308 2809723212531837308
.
.
.
6543248948113846523 6543248948113846523
6543248948113846524 6543248948113846524
6543248948113846525 6543248948113846525
.
.
.
Of course if we wanted to store all positions we would have to do it like this. But a table with 70 trillion slots takes up far too much space, so it is hardly possible anyway.

Now luckily enough alpha-beta (and other pruning methods) cuts off most of the positions. So we would get something like this:
Index                   Zobrist key and eval etc.
.
.
.
2809723212531837306 empty
2809723212531837307 empty
2809723212531837308 empty
.
.
.
6543248948113846523 empty
6543248948113846524 6543248948113846524
6543248948113846525 empty
.
.
.
Now we get a whole bunch of empty space instead, but the table is still huge.

Let us say we had a super effective alpha-beta algorithm and only got 100 evaluated positions after the 10 ply search.

Instead of storing those 100 Zobrist keys in a huge table with trillions of unused spots we use a hashtable.

The hashtable has far fewer spots (the bigger the better of course), but let us use 1000 just as an example (this is an extremely small hashtable). So it could look like this.

Index Zobrist key and eval etc.
0 5423248948113846000
1 empty
2 empty
.
.
.
500 empty
501 empty
502 1863548654542656502
.
.
.
998 empty
999 4564813548868468999

Hash function

Ok, good. So we have a much smaller table and still fit all the positions we searched in it. But in what index should we store a certain position and how do we know where it is?

Index 0 for example does not tell us anything of what position is stored there.

The solution is a hash function. We apply a certain function to the Zobrist key to receive a value between 0 and 999.

That hash function is usually (you can use whatever you want, but this is very easy and logical):
[zobrist key]%[total size of the hashtable]
The % is called 'modulus' (the remainder operator). It divides the key by the size of the table and returns the remainder, which is always an integer between 0 and the table size minus one.

In Java this is written something like:
int hashkey = (int)(zobrist%HASHSIZE);
We want the hashkey to be of type 'int' so we can use it as an index, but the Zobrist key is of type 'long', so we need to cast the result to 'int'.

So let us say we have the Zobrist key 6543248948113846523. We then apply the hash function:
6543248948113846523 % 1000 = 523
So we store that position at index 523.

Now when we want to look up our position we simply apply the hash function to our zobrist key and we get the index we need to find.

As you might have noticed, taking any big number modulo 1000 returns its last three digits; this is not true for other table sizes though. The size we use for the hashtable is, as I mentioned in the beginning, determined by how much memory we can spare. The bigger the hashtable, the more positions we can store in it.
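
One Java-specific detail worth noting (this is my own hedged sketch, not code from Mediocre): the % operator returns a negative result when the Zobrist key itself is negative, which random 64-bit longs often are, so the index has to be forced positive before it can be used:

private static final int HASH_SIZE = 1048583; // the table size mentioned above

// Map a 64-bit Zobrist key to an index in [0, HASH_SIZE).
// The bitmask clears the sign bit so the modulus cannot come out negative.
private static int hashIndex(long zobrist) {
    return (int) ((zobrist & 0x7FFFFFFFFFFFFFFFL) % HASH_SIZE);
}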

Of course we have an obvious problem here, the keys:

1863548654542656502 and 3549874513549875502

would hash to the same index, 502, if we had a hashtable of size 1000. This is called a collision. More about that further down.

Transposition table entries

Now we have a way of storing searched positions, so we need to figure out what information we need about the position.

The less you store the bigger your hashtable can be with the same amount of memory. For example if you, like me, store a whole Move-object at every entry, your hashtable will not be able to hold nearly as much as if you stored a string like "e2e4" or even just an integer representing the move.

Once again I will not consider the memory factor here though.

The following is a list of the most important information, there are other things like mate-values, but these are the basics:
  • Zobrist key

    As I mentioned above two different positions can hash to the same index, so we need a way to make sure we actually have the right position before we use the values in it. The simplest way of doing this is storing the zobrist key, and every time we are about to use an entry, we first compare the zobrist keys, making sure we have the position we were looking for.

    There are other ways of doing this, like just storing half the zobrist key (to save space), but the full zobrist key is the easiest and safest way.

  • Depth

    To get this right we need to do a bit of thinking.

    A depth of '0' means we stored the position based on an evaluation on a leaf node (a horizon node). A depth of '1' means we stored the position on a frontier node and so on.

    So if we run into a transposition with the stored depth '5', we know that the evaluation attached to it is based on 5 more plies of searching.

    This is very important since if we are about to search a position to a depth of 8 ply, and at ply 3 we run into a transposition that we found in a 4 ply search, we can not use the values we have for it. This is because those values are based on a 4 ply search which is much weaker than the expected 8 ply.

    The found position would have depth '1' attached to it, since we had 1 more ply to search when we stored it in the 4 ply search.

    But in the 8 ply search our depth would be '5' when we encountered it, since we had 5 more plies to search.

    So we are not going to use the values in the hash entry, except for searching the move attached to it first. Like I mentioned above this move is usually good.

    In programming terms we would do this check before using values from the transposition table:
    if(hashentry.depth >= ply)
    // Use the values attached

  • Evaluation

    This is the evaluation we received for the position. This is pretty straight forward as long as we know what kinds of values we store.

    In Mediocre the evaluation method returns a positive score if the side to move is ahead and a negative score if it is behind. So +100 means black is ahead if it was black to move when the evaluation was made.

    So if we store the evaluation just like it was received in the alpha-beta we will alternate between positive and negative scores, even if the position is always 'positive' for say white.

    This is not a problem, and I did it like this in Mediocre, as long as we know what values we store. This was one of the first major bugs I had: the evaluation alternated between negative and positive on every step of the iterative deepening. The problem was I handled the evaluation as if it was always positive when white was ahead, which it of course was not.

  • Flag

    Here is another important piece of information that might not be obvious at first.

    There are three different kinds of evaluations in the alpha-beta search:

    Exact evaluation (HASH_EXACT in Mediocre), when we receive a definite evaluation, that is we searched all possible moves and received a new best move (or received an evaluation from quiescent search that was between alpha and beta).

    Beta evaluation (HASH_BETA), when we exceed beta we know the move is 'too good' and cut off the rest of the search. Since some of the search is cut off we do not know what the actual evaluation of the position was. All we know is that it was at least 'beta'.

    Alpha evaluation (HASH_ALPHA), when no move managed to reach up to alpha. We only know that the evaluation was not as high as alpha.

    The exact flag is applied if the evaluation was between alpha and beta.

    The beta flag is applied if the evaluation caused a cut off, i.e. exceeding beta, this is also done if the null-move causes a cut off.

    And the alpha flag is applied if the evaluation never exceeded alpha.

    I will explain how this is used further down.

  • Best move

    This is also pretty straightforward. Here we store the best move for the position. We do not always have a best move to store; in that case we leave the field empty, and as long as we know this can happen it is not a problem.
In Mediocre I fill the hashtable with an internal Hashentry class looking like this:
private class Hashentry
{
    public long zobrist;
    public int depth;
    public int flag;
    public int eval;
    public int ancient;
    public Move move;

    public Hashentry(long zobrist, int depth, int flag,
                     int eval, int ancient, Move move)
    {
        this.zobrist = zobrist;
        this.depth = depth;
        this.flag = flag;
        this.eval = eval;
        this.ancient = ancient;
        this.move = move;
    }
}
As you can see I also have a field called 'ancient' there. This is used to sort out old entries that are of no use anymore; I will explain how it is used below.
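
To show how the depth and flag fields work together when reading from the table, here is a minimal probe sketch. It is not Mediocre's exact code; hashtable, hashIndex() and NO_SCORE are assumed helper names, while HASH_EXACT, HASH_ALPHA and HASH_BETA are the flags described above:

// Return a usable score for this position, or NO_SCORE if the entry is
// missing, belongs to a different position, or is not deep enough.
private int probeHash(long zobrist, int depthLeft, int alpha, int beta) {
    Hashentry entry = hashtable[hashIndex(zobrist)];
    if (entry == null || entry.zobrist != zobrist)
        return NO_SCORE;                       // empty slot or a collision

    if (entry.depth >= depthLeft) {
        if (entry.flag == HASH_EXACT)
            return entry.eval;                 // exact score, use it as is
        if (entry.flag == HASH_ALPHA && entry.eval <= alpha)
            return alpha;                      // real score is at most alpha
        if (entry.flag == HASH_BETA && entry.eval >= beta)
            return beta;                       // real score is at least beta
    }
    // Too shallow, or the bound is not useful here; the caller can still
    // search entry.move first for the move ordering gain.
    return NO_SCORE;
}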

Replacement Schemes

If we want to store a position that already exists in the transposition table we need to make a decision. We use a replacement scheme to decide whether to keep the old entry or replace it.

There has been a lot of research on this subject; here is the short version.
  • Always replace

    Using this scheme we always replace an entry; we do not care what the old entry was holding.

    This scheme works quite ok, and is very simple to implement.

  • Replace by depth

    Here we replace if the new entry has greater depth than the old entry. The idea is that an entry with a greater depth used more time to get to the evaluation, hence keeping the entry with the greater depth will save more time in the future.

  • Deep + always

    Here we store two entries at every index, the first entry being 'replace by depth' and the second 'always replace'. So if the new entry has greater depth than the first entry, we put it there. If it has smaller depth we overwrite the second entry without looking at it.

    Testing has shown this is the most effective scheme; it is a tad more complicated to implement though.


There are also two other schemes: 'old', which never replaces an entry (the opposite of always replacing) and is quite useless, and 'big', which instead of looking at the depth looks at the actual size of the subtree. The latter requires keeping track of the subtree sizes and is quite complicated, but it is supposed to be fast (even faster than replacing by depth).

Mediocre uses the 'replace by depth' scheme. I might implement a two level transposition table (depth + always) in the future, but for now only looking at depth works fine.
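
As a hedged sketch of what a 'replace by depth' store could look like (hashtable and hashIndex() as in the earlier sketches; masterAncient is the 'ancient' marker explained in the Ancient nodes section below):

// Store a search result, but only overwrite the old entry if the new
// result was searched at least as deep.
private void storeHash(long zobrist, int depthLeft, int flag, int eval, Move move) {
    int index = hashIndex(zobrist);
    Hashentry old = hashtable[index];
    if (old == null || depthLeft >= old.depth) {
        hashtable[index] = new Hashentry(zobrist, depthLeft, flag,
                eval, masterAncient, move);
    }
}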

Collisions

When storing a position in Mediocre I do not check if the new entry is for the same position or another one (i.e. two different positions hashed to the same index). I simply check if the depth is greater in the new position, and if it is, it replaces the old entry, even if it is a totally different position.

However there are ways to fill the hashtable better.

If you have two different positions hashing to the same index you can store the second position at, say, index hashkey+1. When you look for a position you first look at index 'hashkey', and if another position is stored there you look at index hashkey+1.

You can also have several elements at the same index; these are called 'buckets'. When looking at an index you check bucket 1 (element 1) first, then bucket 2 and so on. If all buckets are filled up you need to start replacing.

If you have too many buckets, or use too many extra indices (e.g. hashkey+100), searching for a position takes too much time since you have to look at every possible spot before determining that the position is not there.

Four buckets for every index is what most people use, I believe.

Remember you should only have one entry for a particular position stored in the table, the buckets (or hashkey+1) are used for different positions hashed to the same index.

This way you fill the hashtable more effectively.

I will probably implement this in the future.
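
If I do, a bucketed probe might look roughly like this (a hedged sketch, assuming the table is simply made BUCKETS times larger and the entries for one index are stored next to each other):

private static final int BUCKETS = 4;   // four entries per index

// Look for the position in any of the buckets belonging to its index.
private Hashentry probeBuckets(long zobrist) {
    int base = hashIndex(zobrist) * BUCKETS;
    for (int i = 0; i < BUCKETS; i++) {
        Hashentry entry = hashtable[base + i];
        if (entry != null && entry.zobrist == zobrist)
            return entry;                // found the exact position
    }
    return null;                         // not stored (or already replaced)
}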

Ancient nodes

Since Mediocre only replaces an entry in the transposition table if its depth is greater than the old entry, the hashtable will sooner or later be filled up with deep results from old searches that probably never will occur on the board again.

There are different ways to handle this. One obvious way is to clear the transposition table after each or every couple of moves.

But by doing that you might lose transpositions that could have been used on later moves.

Instead I use an 'ancient' field in the hash entries. In the Mediocre-class I switch a 'master ancient' marker every time I make a move, and if an entry has a different ancient value than the 'master ancient' it is automatically replaced. This way old entries will be replaced about half of the time.

This is a crude way of doing it and I will develop it further in future releases.
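
In code the idea is tiny. Here is a hedged sketch (the field and method names are assumed, and Mediocre's actual code may differ; the Hashentry class above stores ancient as an int, so the marker is kept as 0/1):

private int masterAncient = 0;   // flipped between 0 and 1 every time a move is played

// An old entry may be overwritten if the slot is empty, if it was stored
// with the other 'ancient' value (i.e. in an old search), or if the new
// result is at least as deep.
private boolean canReplace(Hashentry old, int depthLeft) {
    return old == null
            || old.ancient != masterAncient
            || depthLeft >= old.depth;
}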

New way of collecting the pv

Since we now store all our positions in the transposition table we can extract the principal variation from it.

When the search is finished we look up the position from where the search started and add the best move from the hashtable. Then we make that move on the board, get the move for the new position, and so on until the expected number of moves has been found.

After we are done we unmake all the moves on the board and we have our principal variation.

We have to be careful that everything is right when doing this, there are some nasty bugs that can occur, like I illustrated in my last post.
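
A hedged sketch of such a PV walk, assuming a Board class with zobristKey, makeMove() and unmakeMove() (the names are mine, not Mediocre's). The length cap and the zobrist comparison are there to guard against missing or overwritten entries:

// Walk the transposition table from the current position, playing each
// stored best move, and stop when an entry is missing, was overwritten
// by another position, or holds no move.
private Move[] collectPV(Board board, int maxLength) {
    Move[] pv = new Move[maxLength];
    int length = 0;
    while (length < maxLength) {
        Hashentry entry = hashtable[hashIndex(board.zobristKey)];
        if (entry == null || entry.zobrist != board.zobristKey || entry.move == null)
            break;                        // the PV ends here
        pv[length++] = entry.move;
        board.makeMove(entry.move);
    }
    for (int i = length - 1; i >= 0; i--) // take back the moves we played
        board.unmakeMove(pv[i]);
    return java.util.Arrays.copyOf(pv, length);
}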

Draw detection

I threw away my old crude fen-string draw detection and replaced it with a new zobrist key based detection instead.

For now it works exactly the same, but instead of generating and storing the slow fen-strings I store zobrist keys in a small hashtable, and if I detect a repetition I return 'draw' as evaluation.

This only takes care of repetitions where the position has occurred once in the game history and once in the search, so repetitions that occur only in the search tree are not considered.

I will improve this in future releases.
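
A hedged sketch of the underlying check (Mediocre keeps the keys in a small hashtable; for clarity this sketch just scans an array of the game history keys, with assumed names):

// Return true if the current position's zobrist key has already occurred
// in the game history, in which case the search scores it as a draw.
private boolean isRepetition(long zobrist, long[] historyKeys, int historyCount) {
    for (int i = 0; i < historyCount; i++) {
        if (historyKeys[i] == zobrist)
            return true;
    }
    return false;
}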

Moving on, finally!

I can finally move on to more interesting things, like improved evaluation, and take a look at internal iterative deepening and futility pruning (again). Also an in-check extension is sorely needed. We will see where I start.

4 comments:

Anonymous said...

About PV extraction: I once used the method You mentioned to extract the PV from the transposition table.
It works fine until You change the replacement scheme. So be careful: You may get into a situation where the transposition table no longer contains the requested entry (because it has been replaced by a newer item mapped to the same key). If this happens your PV gets shorter than expected, or worse: your program may halt if not handled carefully. This is a "feature" of transposition tables which took me 2 weeks to uncover. So be warned :)

MikKi

Jonatan Pettersson said...

Yeah I have noticed this (and pretty much any other bug you can think of :).

Right now I think the pv extraction works as it should, but I've encountered pv being shorter than expected and even 0 length, crashing the program.

I have not figured out a good way to completely avoid this yet, but I know where to look first if something strange happens. :)

Good to hear I'm not the only one having trouble with my transposition tables. :)

Anonymous said...

Bugs and transposition tables are one and the same :) My Othello engine is really tough, but when I first added the transposition tables it became quite fast and stupid... I'm still not sure how I made it work, because there is still one thing I don't understand (I use the replace-if-deeper scheme): for exact values I need to set the depth to be greater by one than it actually is (otherwise they are ignored). It happened nearly a year ago and I still don't get what's wrong :) Nevertheless this way the engine seems much tougher than it was before...

George said...

Hi,
I've been wondering about this for a while: When using Zero-Window search algorithms such as MTD(f), how does one go about storing the best move in the transposition table? From my understanding, only one move will ever be looked at in any node in a zero-window search (maybe this is where I'm wrong, but I haven't for the life of me been able to find a proper explanation of how zero-window searches work, everyone always just talks about how they're more efficient and return bounds instead of accurate scores - if you know of such an explanation for newbies, I'd be very grateful to read it). So how could the algorithm ever know which is the best move if it only looks at a single one? Right now I'm looking at Aske Plaat's MTD(f) documentation, and he mentions storing moves but doesn't include it in his pseudocode (which I'm more or less shamelessly copying because I don't quite understand how and why 0-window searches work)