Feb 11, 2007

[Guide] The memory in the hash tables

Having optimized some more, Mediocre is now searching much faster than the previous version while keeping the node count for each ply intact. Meaning it generates the same moves for a certain fixed ply, only much faster. Just like my goal was.

For the first time Mediocre now searches so many nodes in a (short) game that the transposition table fills up, this was never an issue in the old versions.

How much memory do we use

To be honest I had no idea how much memory the transposition table would take when filled. So some counting is needed.

A hash entry in Mediocre consists of:
long zobrist       -> 64 bits
int depth -> 32 bits
int flag -> 32 bits
int eval -> 32 bits
int ancient -> 32 bits
Move move
int pieceMoving -> 32 bits
int fromIndex -> 32 bits
int toIndex -> 32 bits
int capture -> 32 bits
int moveType -> 32 bits
int[] prevpos
int en passant -> 32 bits
int wcastling -> 32 bits
int bcastling -> 32 bits
int halfmoves -> 32 bits

Total -> 480 bits
And if we have an entry at each slot:
Slots (2^20)       -> 1048576
2^20 * 480 -> 503316480 bits
Total -> 60 mb
So at full load the transposition table with 2^20 slots (like Mediocre has) would take 60mb of memory.

Similarly the repetition detection table (with 2^16 slots) takes 3.75mb memory.

The standard max memory allocation for a Java application (that is without setting a -xmx at startup) is 64mb. So as you can see when adding some basic memory usage for the application itself we exceed the memory.

This resulted in severe slowdowns at the end of the games and Mediocre crashing due to heap size being exceeded afterwards.

To resolve this issue the easy way I simply scaled the transposition down one 'size' to 2^19. This makes the transposition table take less than 32mb and we will never be close to exceeding the total (64mb) memory.

Optimizing the memory usage

While we can solve the issue like mentioned above it is of course not optimal. A larger transposition table means more stored positions and a faster search.

We can use the -xms and -xmx arguments when starting Mediocre, this sets the minimum and maximum memory usage the application gets allocated. This would allow us to use the larger transposition table, but that has really nothing to do with Mediocre itself but rather the hardware it is running on (if we have 2gb free RAM to use, we might as well use it).

What is more important is how we use the memory we get allocated.

As I said above one hash entry currently takes 480 bits of memory. This is by far too much.

This is what I plan for one hash entry:
long zobrist                -> 64 bits
int move -> 32 bits
int depth/flag/eval/ancient -> 32 bits

Total -> 128 bits
And if we have an entry at each slot:
Slots (2^20)                -> 1048576
2^20 * 128 -> 134217728 bits
Total -> 16 mb
Now we only use 16mb for the same table that took 64mb before. If we go up one 'size' to 2^21 slots we use 32mb and 2^22 uses 64mb.

So a bigger table is possible for the same amount of memory.

I explained the move integer in an earlier post, the same routine will be used for the 'depth/flag/eval/ancient' integer. That is certain bits will be used to represent certain information. For instance I will use 18 bits to represent the evaluation (18 bits can hold a number up to 262143 which can hold -100000/100000 which a mate value is worth).

I could reduce the memory usage even more by only using half the zobrist as lock (this is what we use to make sure we have the right position). This would make the zobrist key an integer (32 bits) instead and a hash entry would then take 96 bits instead of 128.

However, as I have mentioned a few times already the Zobrist keys are not entirely unique. There is a risk that two Zobrist keys matches two different positions and if we only use half the key that risk increases.

While the risk is still very small, and even if we accidently use a faulty score somewhere in the search it does not nescessarily mean disaster, it is still a risk and you would have to weigh the risk against better memory usage. I decided to use the full Zobrist.

What is left for the next version

I have not yet started to implement the new move representation, which in turn is needed for the new transposition table entry representation. On the other hand I am pretty much done will all the preparations for a smooth transition.

Hopefully I will avoid all the nasty bugs I got last time around and the new version of Mediocre should soon be here. :)

17 bits will not be enough for evaluation since we need negative values as well. 18 bits is needed.


Genorb said...

I cannot wait for the new version ;) (Mediocre addicted, should see my Doctor :) ).

Jonatan Pettersson said...

Me too. :) I hope I can pull this off without any annoying bugs.

There are still many optimizations to make of course, but this should take Mediocre to new levels.