Computer Chess Club Archives


Subject: Re: Elo performances of Rybka settings in my testing

Author: Vasik Rajlich

Date: 05:11:35 12/25/05


On December 24, 2005 at 21:43:19, Albert Silver wrote:

>On December 24, 2005 at 18:47:48, Rolf Tueschen wrote:
>
>>Perhaps a stupid question, but let me ask it for my own interest. You give a
>>couple of types with different results and then after 160 games each you get
>>different results (one time some 50 points difference) and then I've read the
>>item "ponder=off"; do you know something about the importance of that factor?
>>I'm thinking about the importance of that aspect for say FRITZ with its
>>tradition or better experience. Please don't go into all details, just give me a
>>short estimation for the importance. Without knowing the answer, I see three
>>critical points: ponder off, 160 games and difference of 50 points between two
>>special types.
>
>Relax dude. That other part was quite unnecessary. For anyone playing with a
>single computer or single cpu/core, "ponder=off" is obligatory. What it means is
>that the engines do not think during the time of the opponent. Otherwise one
>cannot know if one engine is using the CPU more, if there is a problem in the
>equality of the usage of the CPU, etc. Nowadays, there are even dual-core
>processors (equivalent to two separate processors in one - the AMD Athlon 64 X2
>series is an example), so that people can run engine matches with Ponder=On, but
>my processor is not one of them.
>
>The engines do have separate and equal-sized hash tables (256 MB for me, which
>is fine since I have 1 GB Ram total), and when it is their turn to think and
>play, they make use of the previously calculated hash tables.
>
>In theory, there should be no change in Elo performance using "ponder=off". The
>conditions of the matches I reported are these:
>
>I tested 3 Rybka settings ('very positional', 'slightly positional', and
>'slightly tactical') against identical opponents (Deep Fritz 8, etc.) using
>identical openings (the Nunn2 set), in order to give each setting the exact same
>conditions and, hopefully, show any performance differences more clearly.
>
>The Nunn2 set, in case you don't know, is a set of 20 opening positions chosen
>by GM John Nunn, in which 2 engines play 2 games of each position, once as white
>and then as black. So 4 opponents for each setting means 4 matches of 40 games,
>or 160 games.
>
>One can argue that the positions may favor one engine more than another, but the
>positions are designed to try and provide a variety of typical types of
>opening/middlegame problems to solve, and remember that the engines get to play
>both sides of each position.
>
>I am starting another series to test the Rybka settings against Junior 8 (I
>don't have J9), and see how they do.
>
>                                      Albert

Of course there was nothing wrong with your tests.

I just want to add one thing. Technically, 160 games is not enough - your
variance is bigger than the performance difference.
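
To put a rough number on that, here is a minimal sketch of a 95% interval on
the Elo difference from a single 160-game sample. It assumes the standard
logistic Elo formula and a normal approximation to the score; the
win/draw/loss counts are made up for illustration and are not results from
the matches above, and the function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference
    using the standard logistic Elo formula."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, lower, upper) Elo difference.

    Uses a normal approximation to the mean game score; z=1.96
    corresponds to roughly 95% confidence.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

# Hypothetical 160-game result: 60 wins, 50 draws, 50 losses.
est, lo, hi = elo_interval(60, 50, 50)
print(f"Elo difference: {est:+.0f} (95% interval {lo:+.0f} to {hi:+.0f})")
```

With counts like these the interval is roughly 90 Elo wide and includes zero,
so a 50-point gap between two settings, each measured over its own 160 games,
is within what noise alone could produce.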

In practice, though, I will often make decisions based on even less data than
this. If I like a certain change intuitively, and it jumps out to a decent lead,
I'll just keep it.

Unscientific regards,
Vas



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.