Nov 25, 2011

[Info] Testing results

So here is some testing to confirm I didn't do anything silly.

M1-1 is a version with 64-bit Zobrist keys in the transposition table, removal of the notion of a "row", and some evaluation fixes, but without the tapered eval (see previous posts for more info).
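For reference, here is a minimal sketch of the 64-bit Zobrist key idea, not Mediocre's actual code: one random 64-bit number per piece/square combination, XOR:ed in and out as the board changes. The 12-piece-type table and 0-63 square indexing are assumptions.

import java.util.Random;

// Minimal sketch of 64-bit Zobrist keys (not Mediocre's actual code).
public class Zobrist {
    // One random 64-bit number per (piece type, square) combination,
    // assuming 12 piece types and 0-63 square indexing.
    static final long[][] PIECE_SQUARE = new long[12][64];
    static final long SIDE_TO_MOVE;

    static {
        Random r = new Random(20111125L); // fixed seed, reproducible keys
        for (int piece = 0; piece < 12; piece++)
            for (int square = 0; square < 64; square++)
                PIECE_SQUARE[piece][square] = r.nextLong();
        SIDE_TO_MOVE = r.nextLong();
    }

    // Incremental update for a quiet move: XOR the piece out of the
    // from-square, into the to-square, and flip the side to move.
    static long updateKey(long key, int piece, int from, int to) {
        return key ^ PIECE_SQUARE[piece][from]
                   ^ PIECE_SQUARE[piece][to]
                   ^ SIDE_TO_MOVE;
    }
}

The point of going to full 64 bits is simply fewer key collisions in the transposition table.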

Against the Mediocre v1.0 beta it turned out like this:


   Program      Elo    +    -   Games   Score   Av.Op.  Draws
 1 M1-1      : 2401    6    6   11029   50.4 %   2399   24.5 %
 2 M1B       : 2399    6    6   11029   49.6 %   2401   24.5 %


So pretty much equal, which is good enough. The worst case here would be the beta version being slightly stronger, but given the +/- 6 error bars that could only be by a few Elo points.
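To put numbers on that: the usual logistic Elo model turns a match score s into a rating difference of -400 * log10(1/s - 1). A quick sanity check with a hypothetical helper, not part of either engine:

// Hypothetical helper, not from Mediocre: the Elo difference implied
// by a match score, diff = -400 * log10(1/score - 1).
public class EloDiff {
    static double eloDiff(double score) {
        return -400.0 * Math.log10(1.0 / score - 1.0);
    }

    public static void main(String[] args) {
        // 50.4 % works out to roughly +3 Elo, well inside the
        // +/- 6 error bars in the table above.
        System.out.printf("50.4%% score = %+.1f Elo%n", eloDiff(0.504));
    }
}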

And against some other engines, just to confirm.


   Program      Elo    +    -   Games   Score   Av.Op.  Draws
 1 counter   : 2593   15   15    2048   76.4 %   2389   23.4 %
 2 M1-1      : 2392    8    8    6154   48.2 %   2405   14.4 %
 3 adam      : 2337   14   15    2048   42.5 %   2389    9.3 %
 4 bikjump   : 2294   15   15    2048   36.6 %   2389   10.6 %

   Program      Elo    +    -   Games   Score   Av.Op.  Draws
 1 counter   : 2580   14   14    2048   75.1 %   2388   25.4 %
 2 M1B       : 2390    8    8    5854   47.5 %   2407   15.8 %
 3 adam      : 2343   14   14    2048   43.7 %   2388    9.3 %
 4 bikjump   : 2290   16   16    1748   36.4 %   2388   12.1 %

The newer version seems to be holding up.

I'll release a new version with this during the weekend, probably on Sunday.

Then I have a steady foundation to start tackling the evaluation again.
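For the record, the tapered eval mentioned above would look something like this. A minimal sketch of the idea, not anything from Mediocre's source, with an assumed phase scheme of minor = 1, rook = 2, queen = 4:

// Minimal sketch of a tapered eval (assumed phase weights, not
// Mediocre's actual code): keep separate middlegame and endgame
// scores and blend them by remaining material.
public class TaperedEval {
    // Full phase: per side 4 minors * 1 + 2 rooks * 2 + 1 queen * 4 = 12.
    static final int MAX_PHASE = 24;

    // phase = MAX_PHASE at the start, 0 with only kings and pawns left.
    static int taper(int mgScore, int egScore, int phase) {
        return (mgScore * phase + egScore * (MAX_PHASE - phase)) / MAX_PHASE;
    }
}

The nice property is that the score shifts smoothly from the middlegame terms to the endgame terms as material comes off, instead of jumping at some arbitrary cutoff.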
