Jan 21, 2009

[Other] Test tourney for Mediocre v0.34 (in development)

After hitting a dead end with the new Board-class and getting nothing but confused by the changes to the evaluation, I decided to go back to the old Board-class and implement all the obvious bug fixes I found (excluding the one mentioned in my previous post, of course).

This seems to have been a good move, as this test tournament shows.

Engine Score vs. Mediocre
01: Mediocre 243,0/400 ····················
02: Hamsters 19,5/20 111111111111=1111111
03: Hermann 16,5/20 1=1=11=1011111111011
04: Diablo 15,5/20 1=111111111=11=11000
05: LittleThought 14,5/20 0111111111011010=110
06: NanoSzachy 13,0/20 1=00=111011001101111
07: Counter 12,5/20 11==0001=01=11==111=
07: Gaia 12,5/20 1=11=10110001=001111
09: AliUCI 10,5/20 111010000=0101110110
10: Feuerstein 8,0/20 =00000011101=0100011
10: GreKo 8,0/20 =111000=010000101100
12: Amundsen 7,0/20 10010000110101100000
13: Bison 5,5/20 000001001010=0100010
14: Gibbon 4,5/20 0100010000001100000=
15: Clarabit 3,5/20 100=00000000000=00=1
16: Lime 2,5/20 =0000000001000010000
17: BBChess 1,5/20 000000000010000=0000
18: Bikjump 1,0/20 00000000001000000000
19: FluxII 0,5/20 000000000000000000=0
19: Vicki 0,5/20 =0000000000000000000
21: Roce 0,0/20 00000000000000000000

As mentioned earlier, some of these engines are acting up quite a bit. For example, BBChess seems to use only about 10 seconds of the minute regardless of game length, and FluxII loses due to illegal moves now and then.

But the important part is that they did that in previous test tournaments as well, so the results should still be valid when comparing the strength of Mediocre versions.

Mediocre v0.334 scored 200.5/400, which is about 50%, and this result (243/400) is just over 60%, which gives a difference of roughly 70 rating points (I believe).
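
As a rough check of that estimate, here is a minimal sketch that converts each score into a rating difference against the field and subtracts the two. It assumes the standard logistic Elo formula and treats the 20 opponents as one average opponent; the function name `elo_diff` is just for illustration and is not part of Mediocre.

```python
import math


def elo_diff(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model).

    Assumption: the whole field is treated as a single average opponent.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Scores taken from the post: v0.334 scored 200.5/400, the new version 243/400.
old_rating = elo_diff(200.5 / 400)
new_rating = elo_diff(243.0 / 400)
gain = new_rating - old_rating

print(f"old: {old_rating:+.1f}  new: {new_rating:+.1f}  gain: {gain:.1f}")
```

With the rounded 50% and 60% figures the formula gives about 70 points; with the exact scores it lands a few points higher, so the estimate is in the right ballpark.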

Not sure how accurate this is, especially since some engines give their games away, but it should show that the new version is clearly better, I hope.
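
One way to put a rough number on that uncertainty is a confidence interval around each score. The sketch below is an assumption-heavy estimate, not a proper analysis: it treats the 400 games as independent, uses the binomial variance p(1-p)/n (which overstates the spread when there are draws), and the names `elo_from_score` and `elo_interval` are hypothetical helpers made up for this example.

```python
import math


def elo_from_score(p):
    """Rating difference implied by score fraction p (logistic Elo model)."""
    return 400.0 * math.log10(p / (1.0 - p))


def elo_interval(points, games, z=1.96):
    """Return (low, mid, high) rating differences for a match result.

    Assumptions: games are independent, and the per-game variance is bounded
    by the binomial value p * (1 - p), which is conservative when draws occur.
    z = 1.96 corresponds to a roughly 95% interval.
    """
    p = points / games
    stderr = math.sqrt(p * (1.0 - p) / games)
    eps = 1e-9  # keep the bounds strictly inside (0, 1) so the log is defined
    low_p = max(eps, p - z * stderr)
    high_p = min(1.0 - eps, p + z * stderr)
    return elo_from_score(low_p), elo_from_score(p), elo_from_score(high_p)


# Scores taken from the post.
old = elo_interval(200.5, 400)  # Mediocre v0.334
new = elo_interval(243.0, 400)  # version in development

print("v0.334: %+.0f (%+.0f .. %+.0f)" % (old[1], old[0], old[2]))
print("new:    %+.0f (%+.0f .. %+.0f)" % (new[1], new[0], new[2]))
```

Under these assumptions the two intervals do not overlap, which supports the conclusion, but the opponents are far from identical, so the margins should be read loosely.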


Germán said...

Were these engines used because they have a similar strength to Mediocre's?

Jonatan Pettersson said...

Yes, I tried to pick a few above and a few below Mediocre's strength as well.

I believe this is the best way to spot most potential weaknesses.