Computer Chess Club Archives



Subject: Re: How to setup fair and reasonable enginematches?

Author: Shaun Brewer

Date: 03:15:06 07/26/02



On July 25, 2002 at 14:40:06, Joachim Rang wrote:

>I'm going to run some automatches between amateur engines. As this is the first
>time I've tried this, I'd be glad to get some advice.
>
>I will use a PIII notebook at 1.1 GHz with 256 MB RAM. I plan to run not a
>tournament but one-to-one matches with many games (a few hundred) to get
>significant results.
>
>What are fair and reasonable test conditions?

Try to reduce the number of background activities. On one machine I run Norton
SystemWorks, which seems to ruin results, so I don't use that machine for matches.

>
>I will run the match on a PIII 1.1 GHz with 256 MB RAM under WinXP. I think it's
>reasonable to give every engine an 80 MB hash table. I still don't know which
>GUI I should use. I tested Arena and it worked, but with some problems. With
>Winboard I don't know how to run automatches; maybe someone can help?
>
>The first engine will be Yace 0.99.56, which I consider the strongest amateur
>engine today.
>Which engine can compete with Yace?
>

Crafty - I have so far found 17.10 to be stronger than 18.15

Tests on Tbird Athlon 850 (120+2)

Individual statistics:

(1) wcrafty-17.10             : 500 (+166,=246,- 88), 57.8 %
(2) Crafty-18.15              : 500 (+ 88,=246,-166), 42.2 %

    Program                            Score     %    Av.Op.  Elo    +   -   Draws

  1 wcrafty-17.10                  : 289.0/500  57.8   2373   2427   27  19   49.2 %
  2 Crafty-18.15                   : 211.0/500  42.2   2427   2373   19  27   49.2 %

Tests on Tbird Athlon 850 (300+5)

Individual statistics:

(1) wcrafty-17.10             : 500 (+159,=231,-110), 54.9 %
(2) Crafty-18.15              : 500 (+110,=231,-159), 45.1 %

    Program                            Score     %    Av.Op.  Elo    +   -   Draws

  1 wcrafty-17.10                  : 274.5/500  54.9   2383   2417   29  20   46.2 %
  2 Crafty-18.15                   : 225.5/500  45.1   2417   2383   20  29   46.2 %

Tests on Tbird Athlon 850 (900+15)

Still running; so far something like +100 -40 =100

Tests on PIII 1332mhz (1800+30)

Still running (when the machine is not in use); too few games to be conclusive

Individual statistics:

(1) wcrafty-17.10             :  52 (+ 23,= 20,-  9), 63.5 %
(2) Crafty-18.15              :  52 (+  9,= 20,- 23), 36.5 %
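
For anyone who wants to check figures like the ones above, here is a minimal Python sketch. It assumes the usual logistic Elo model and a normal approximation for the error margin; the tool that produced the tables may compute its + and - columns differently. The function name `elo_from_wdl` is only illustrative.

```python
import math

def elo_from_wdl(wins, draws, losses, z=1.96):
    """Return (score, elo_diff, elo_low, elo_high) for a match result.

    Assumes the logistic Elo model and a normal approximation for the
    ~95% interval (z=1.96); other rating tools may use different methods.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0 - score) ** 2) / n
    margin = z * math.sqrt(var / n)

    def to_elo(p):
        # Clamp to keep the logarithm finite for lopsided results.
        p = min(max(p, 1e-9), 1 - 1e-9)
        return 400 * math.log10(p / (1 - p))

    return score, to_elo(score), to_elo(score - margin), to_elo(score + margin)

# The 120+2 match above: +166 =246 -88
score, elo, low, high = elo_from_wdl(166, 246, 88)
print(f"score {score:.1%}, Elo diff {elo:+.0f} (approx. range {low:+.0f} to {high:+.0f})")
```

Under these assumptions, a range whose lower end stays above zero suggests the difference is unlikely to be noise; a range that straddles zero means more games are needed.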


>I think I will run the games with 10 minutes and 10 s per move for each engine.
>Is this too short?

No correct answer to this one.

Some engines may be weaker at blitz than at standard time controls, some the
reverse, and some consistent across both.

The longer the time control, the better the quality of the games, but a match
large enough to give statistically relevant results could take months.
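
To get a feel for how long a match takes, here is a rough Python sketch. It assumes both engines use their whole clock and that a game lasts 60 moves per side; both are illustrative guesses, not measured values, and `match_duration_days` is a made-up helper name.

```python
def match_duration_days(games, base_s, inc_s, moves_per_game=60):
    """Rough upper-bound wall-clock time for a match, in days.

    Assumes both engines use their whole clock and that a game lasts
    `moves_per_game` moves per side (60 is an illustrative guess).
    """
    per_side = base_s + inc_s * moves_per_game  # seconds one engine may use
    per_game = 2 * per_side                     # both engines' clocks
    return games * per_game / 86400             # 86400 seconds per day

# The four time controls mentioned above, 500 games each.
for base, inc in [(120, 2), (300, 5), (900, 15), (1800, 30)]:
    days = match_duration_days(500, base, inc)
    print(f"{base}+{inc}: about {days:.1f} days for 500 games")
```

These are estimates for a machine running around the clock; on a machine that is only free part of the time, the longer controls stretch out accordingly.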

>
>Another question is which opening books I should use. Yace has its own book, as
>do most of the other engines. But I don't know how good they are; SOS, for
>example, comes with only a very small book of its own, which will be a handicap.
>Does anyone know a "neutral" but large book that will work with most engines?

For your experiment I would initially use the book provided or recommended by
the author. If you then test with a different book and it produces better
results, share your findings with the author.

Have fun

Shaun

>
>So far today, I'd appreciate any suggestions from you.




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.