Oct 17, 2011

[Test] Rough day

After Saturday's test gauntlet with v0.34, I ran another set with the newest version (futility pruning being the latest big addition).

And I was completely dumbfounded by the result:
Rank Name              Elo    +    - games score draws
   1 Counter 1.2       265   34   33   297   70%   22%
   2 Knightx 1.92      140   34   33   297   53%    9%
   3 iCE 0.2           124   33   33   297   51%   10%
   4 Mediocre v0.34    119   19   19  1107   65%    7%
   5 Mediocre 1.0 -2   112   15   15  1857   64%    9%
   6 Horizon 4.4        64   34   34   296   44%    7%
   7 TJchess 1.01       -8   35   36   295   35%    5%
   8 Roce 0.0390       -47   34   36   295   30%   13%
   9 Lime 66           -58   36   37   297   30%    5%
  10 Adam 3.3         -171   40   43   297   19%    5%
  11 Wing 2.0a        -252   45   51   296   13%    3%
  12 Bikjump 2.01     -290   48   55   297   10%    4%

Mediocre v0.34 back ahead of 1.0?? Having played literally thousands of games over the last week, I haven't seen any sign of this.

The new version was beating the crap out of 0.34, at 60+% win ratio.
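To put that 60% next to the table, here is a rough sketch that converts a score fraction into an Elo difference using the standard logistic Elo formula. It assumes "win ratio" means the overall score fraction (draws counted as half a point), which is my reading and not something the test logs confirm; the function names are made up for this illustration.

```python
import math

def elo_diff_from_score(score):
    """Elo difference implied by a score fraction (0 < score < 1),
    using the standard logistic Elo model."""
    return 400.0 * math.log10(score / (1.0 - score))

def expected_score(elo_diff):
    """Expected score fraction for a given Elo advantage (same model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# A 60% score corresponds to roughly a 70 Elo advantage.
head_to_head = elo_diff_from_score(0.60)
print(round(head_to_head, 1))

# The gauntlet table instead puts the new version 7 Elo *behind* v0.34
# (112 vs 119), which would mean scoring just under 50% head to head.
table_gap = 112 - 119
print(round(expected_score(table_gap), 3))
```

So the head-to-head games and the gauntlet disagree by something like 75 Elo, which is why the table looks so strange.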

So I ran a few more test sets, and v0.34 was neck and neck with all of the newer versions.
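One way to read "neck and neck" off the table is to compare the error margins. The sketch below assumes the + and - columns are the upper and lower error margins around each Elo estimate, and it uses interval overlap as a crude check only; it is not a proper significance test, and the helper names are invented for this example.

```python
def elo_interval(elo, plus, minus):
    """Return (low, high) bounds from an Elo estimate and its +/- margins."""
    return (elo - minus, elo + plus)

def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Values taken from the table above.
v034 = elo_interval(119, 19, 19)  # Mediocre v0.34
v10 = elo_interval(112, 15, 15)   # Mediocre 1.0 -2

print(v034, v10, intervals_overlap(v034, v10))
# prints: (100, 138) (97, 127) True
```

With the intervals overlapping this much, the table alone can't really say which of the two versions is stronger.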

Chess programming can be really hard on your self esteem. :)

I'm re-running a huge (by my standards) gauntlet, including both v0.34 and v1.0, and we'll see what happens.

I almost suspect I'm using some weird compile or something, but this seems to be the new truth. :(

2 comments:

Ilari Pihlajisto said...

This is why one should use a version control system. Then you could go back step by step (commit by commit) to find out where the regression happened.

Anyway, why I really decided to post here... I'm one of the two developers of Cute Chess, and I noticed that you had some trouble getting cutechess-cli to work with Mediocre. I ran a few games with Mediocre 0.34 on Linux, and it seemed to work fine. You mentioned that engines are timing out. Do you mean that they lose on time, or do they fail to respond to ping, stall, crash etc.?

Cutechess-cli version 0.4.0 had a lot of these timeout problems due to buggy inter-process communication. In the latest version (0.4.2) the bug is fixed. So my first advice is to make sure you have the latest version.

If the engines are losing on time, you can try using the "timemargin" option to allow the engines to go over the time limit by a bit. If the problem is that Mediocre becomes unresponsive, then I suggest running cutechess-cli with the "-debug" option to see exactly where things go wrong. I also recommend implementing the "ping" Winboard feature; it really helps to keep the interface and the engine synchronized. Without the "ping" feature, cutechess-cli may not know when it's safe to start a new game after the previous one. The "-wait" option can help to give your engine time to prepare for a new game.
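For illustration, the "ping" exchange in the Winboard protocol looks roughly like this: the engine advertises "feature ping=1" in reply to "protover 2", and afterwards answers every "ping N" with "pong N" once it has finished processing all earlier commands. The sketch below is in Python and is only a minimal outline; it is not Mediocre's code, and the handle_command function is a hypothetical helper.

```python
def handle_command(line):
    """Return the reply lines an engine would send for one Winboard command.

    Only the parts relevant to "ping" are sketched here; a real engine
    handles many more commands and must finish pending work before
    replying with "pong".
    """
    parts = line.strip().split()
    if not parts:
        return []
    if parts[0] == "protover":
        # Tell the interface that this engine supports ping.
        return ["feature ping=1 done=1"]
    if parts[0] == "ping" and len(parts) == 2:
        # Echo the same number back so the interface can match the reply.
        return ["pong " + parts[1]]
    return []

print(handle_command("protover 2"))
print(handle_command("ping 7"))
```

Because the reply carries the same number as the request, the interface can tell that the engine has caught up with everything sent before that ping.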

If you've got questions, requests, etc. you can send e-mail to our mailing list or to me, and I'll be glad to assist.

Regards,
Ilari Pihlajisto

Jonatan Pettersson said...

I'm using 0.4.2, so it seems likely the problem is on my side.

There seem to be some unnoticed peculiarities in Mediocre that are causing problems. Since I'm comparing to Arena (which is extremely lenient with protocols), it was quite a step to cutechess-cli, which shuts down the second something is wrong (leaving eight Java processes lying around, since I'm using concurrency 4).

I'll try your advice and see if I can make it work. The times I've gotten it to work it's been an awesome tool, so thank you so much for it.