Mediocre Chess: [Info] Time for more testing

Oct 14, 2011

[Info] Time for more testing

Well, not exactly more testing, but I need to work out a better procedure.

Currently I'm running the tests in Arena, one game at a time (on my quad-core processor...).

The games are timed at 10 sec + 0.1 sec increment. At that rate 1000 games take about 9 hours, which means the run just barely misses finishing before I leave for work in the morning.

I wanted to use cutechess-cli, which is both faster and lets me run four games at a time (one for each core of my processor), but I'm having severe problems with the engines timing out. I wonder if it has to do with Java. It's open source, so maybe I can do something about it. We'll see.
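For reference, here is a rough sketch of how a four-games-at-a-time run could be put together. It only builds the argument list and does not start anything. The engine names and paths are made-up placeholders, `proto=uci` is an assumption about how the engines are run, and the flags (`-engine`, `-each`, `tc=`, `timemargin=`, `-rounds`, `-concurrency`, `-pgnout`) are the ones documented in recent cutechess-cli releases, so check `cutechess-cli -help` for the version at hand.

```python
def cutechess_command(engines, rounds=500, concurrency=4, tc="10+0.1",
                      timemargin_ms=None, pgn_out="match.pgn"):
    """Build a cutechess-cli argument list (nothing is executed here).

    engines: list of (name, command) pairs; the values used below are
    hypothetical placeholders, not real install paths.
    """
    cmd = ["cutechess-cli"]
    for name, command in engines:
        cmd += ["-engine", f"name={name}", f"cmd={command}"]

    # Options shared by every engine. proto=uci is an assumption.
    each = ["-each", "proto=uci", f"tc={tc}"]
    if timemargin_ms is not None:
        # Optional grace period in milliseconds before a loss on time.
        each.append(f"timemargin={timemargin_ms}")
    cmd += each

    cmd += ["-rounds", str(rounds),
            "-concurrency", str(concurrency),
            "-pgnout", pgn_out]
    return cmd


if __name__ == "__main__":
    args = cutechess_command(
        [("Mediocre-new", "./mediocre-new.sh"),   # hypothetical launcher
         ("Mediocre-old", "./mediocre-old.sh")],  # hypothetical launcher
        timemargin_ms=200,
    )
    print(" ".join(args))
```

The list can be handed to `subprocess.run` as is. Whether a time margin actually helps with the timeouts is a guess, not something I have verified.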

Anyway, since I have two time windows where I can run tests, 8 hours at night and 8 hours during the day (while at work), I should probably figure out a procedure that fits them.
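To get a feel for what fits in one window, here is a small back-of-the-envelope calculation. It uses only the figures above (1000 games in about 9 hours, 8-hour windows) and assumes, optimistically, that four simultaneous games scale perfectly across four cores. The function names are just for illustration.

```python
# Figures from the post: 1000 sequential games at 10s+0.1s took about 9 hours.
GAMES_MEASURED = 1000
HOURS_MEASURED = 9.0


def seconds_per_game(games=GAMES_MEASURED, hours=HOURS_MEASURED):
    """Average wall-clock time of one game, derived from the measured run."""
    return hours * 3600.0 / games


def games_in_window(window_hours, concurrency=1):
    """Estimate how many games fit in a time window.

    Assumes throughput scales linearly with the number of simultaneous
    games, which is an optimistic simplification.
    """
    return int(window_hours * 3600.0 * concurrency / seconds_per_game())


if __name__ == "__main__":
    print(f"{seconds_per_game():.1f} s per game")
    for n in (1, 4):
        print(f"concurrency {n}: ~{games_in_window(8, n)} games in 8 hours")
```

That works out to roughly 32 seconds per game, so about 888 games per window one at a time, and about 3555 if four games really could run side by side without slowing each other down.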

Perhaps I could run a self-play match first, then run a gauntlet only if the new version turns out to be better.
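A minimal sketch of that gate, assuming the self-play result is available as win/draw/loss counts from the new version's point of view. It uses the standard logistic Elo formula to turn a score into a rating difference. The function names and the `min_elo_gain` threshold are made up for illustration, and the check ignores error margins, so a small match can easily give the wrong answer.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score (logistic Elo model)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


def should_run_gauntlet(wins, draws, losses, min_elo_gain=0.0):
    """Hypothetical gate: only run the gauntlet if self-play looks better.

    Error margins are not taken into account here.
    """
    return elo_diff(wins, draws, losses) > min_elo_gain


if __name__ == "__main__":
    # Example numbers only, not results from an actual match.
    print(round(elo_diff(60, 20, 20), 1))
    print(should_run_gauntlet(60, 20, 20))
```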

Today's test looked like this:

Rank Name                 Elo    +    - games score draws
   1 Gaviota-win64-0.84   334   34   30   606   93%    6%
   2 Mediocre 1.0 -2      -60   21   20   600   43%   21%
   3 Mediocre 1.0 -1     -112   20   20   613   35%   22%
   4 Mediocre v0.34      -162   21   21   607   28%   19%

Mediocre 1.0 -2 is the one with futility pruning; -1 also has lazy eval included. Judging by the table, the lazy eval was not so successful.

And Gaviota is beating the living crap out of all versions. I won't allow that for too long. :)

I'll get back to you on the new testing setup. Time to get that stupid cutechess-cli to work somehow.
