This seems to have been a good move, as this test tournament shows.
As mentioned earlier, some of these engines are acting up quite a bit. For example, BBChess seems to use only about 10 seconds of the minute no matter how long the game is, and FluxII loses due to illegal moves now and then.
    Engine             Score  Me
01: Mediocre       243,0/400  ····················
02: Hamsters         19,5/20  111111111111=1111111
03: Hermann          16,5/20  1=1=11=1011111111011
04: Diablo           15,5/20  1=111111111=11=11000
05: LittleThought    14,5/20  0111111111011010=110
06: NanoSzachy       13,0/20  1=00=111011001101111
07: Counter          12,5/20  11==0001=01=11==111=
07: Gaia             12,5/20  1=11=10110001=001111
09: AliUCI           10,5/20  111010000=0101110110
10: Feuerstein        8,0/20  =00000011101=0100011
10: GreKo             8,0/20  =111000=010000101100
12: Amundsen          7,0/20  10010000110101100000
13: Bison             5,5/20  000001001010=0100010
14: Gibbon            4,5/20  0100010000001100000=
15: Clarabit          3,5/20  100=00000000000=00=1
16: Lime              2,5/20  =0000000001000010000
17: BBChess           1,5/20  000000000010000=0000
18: Bikjump           1,0/20  00000000001000000000
19: FluxII            0,5/20  000000000000000000=0
19: Vicki             0,5/20  =0000000000000000000
21: Roce              0,0/20  00000000000000000000
But the important part is that they misbehaved the same way in previous test tournaments, so the results should still be valid when comparing the strength of Mediocre versions.
Mediocre v0.334 scored 200.5/400, which is about 50%, and this result (243/400) is about 61%, which gives a rating difference of roughly 70 to 75 points (I believe).
I'm not sure how accurate this is, especially since some engines give their victories away, but it should show that the new version is clearly better, I hope.
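For anyone who wants to check the rating estimate, here is a minimal sketch of the arithmetic. It assumes the standard logistic Elo formula, where a score fraction p against a field corresponds to a rating edge of 400 * log10(p / (1 - p)). The function name `elo_diff` is just made up for this example, and this is not necessarily how the figure above was worked out.

```python
import math


def elo_diff(score_fraction: float) -> float:
    """Rating edge implied by a score fraction, under the logistic Elo model.

    Assumption: expected score = 1 / (1 + 10 ** (-diff / 400)),
    which inverts to diff = 400 * log10(p / (1 - p)).
    """
    if not 0.0 < score_fraction < 1.0:
        # 0% and 100% scores imply an unbounded rating difference.
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# Scores taken from the two test tournaments, both against the same field.
old_edge = elo_diff(200.5 / 400)  # Mediocre v0.334
new_edge = elo_diff(243.0 / 400)  # the new version

print(f"old version vs. field: {old_edge:+.1f}")
print(f"new version vs. field: {new_edge:+.1f}")
print(f"estimated improvement: {new_edge - old_edge:+.1f}")
```

With these two scores the gap comes out at around 75 points, in the same ballpark as the estimate above. With only 400 games, and with some opponents throwing games away, the error margin on that number is still large.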
2 comments:
Were these engines chosen because they have a similar strength to Mediocre's?
Yes, I tried to pick a few above and a few below Mediocre's strength as well.
I believe this is the best way to spot most potential weaknesses.