Computer Chess Club Archives


Subject: Re: on ceiling effects and the need for time adjustments

Author: Uri Blass

Date: 22:54:59 01/18/06


On January 18, 2006 at 21:46:52, Graham Banks wrote:

>On January 18, 2006 at 21:36:37, Joseph Ciarrochi wrote:
>
>>I just ran a 4/4 match of rybka beta 1 versus ruffian 2.1 and rybka won
>>13.5-1.5. It would be real hard for any later rybkas or any engine for that
>>matter to improve on this score.
>>
>>Then it occurred to me: Maybe this is a core problem with our testing. Do we have
>>some ceiling effects going on? For example, both grandmasters and international
>>masters would beat me 20-0, and both would receive the same Elo bonus (actually
>>the GM would receive fewer points for beating me). This happens despite the
>>fact that the GM is much better. The same thing may happen with engine
>>tournaments.
>>
>>Should we perhaps be giving a time bonus to engines below a certain strength
>>(e.g. below ruffian)? If we want to maximize our sensitivity to engine
>>differences, perhaps we should give a sufficiently long time bonus so that
>>winning percentage is closer to 50% rather than 10%.
>>
>>It would be interesting to see how fruit, fritz and rybka perform against
>>ruffian with a 50% time bonus.
>>
>>
>>What do you folks think? Would time bonuses provide more meaningful data?
>>
>>best
>>Joseph
>
>
>Hi Joseph,
>
>I don't think it would provide any meaningful data to be honest. I'd have
>thought that the main purpose of engine v engine testing was to determine which
>performs better given equal conditions.
>Just my opinion,
>
>Regards, Graham.

I do not agree.

It is known that rybka performs better than other engines.

The goal of testing rybka can be to find which version is better, and in that
case it is better to test the different rybka betas against the same opponents
and give those opponents more time than rybka.

I think that it may be interesting to have a rating list for ponder-off games in
which weaker engines get more time (and of course the time that each engine gets
would be stated in the list).

The goal of giving more time to weaker engines should be to get a result that
is as close as possible to 50% for all engines.
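
A rough sketch of why scores near 50% are the most informative. It assumes the standard logistic Elo model and independent win/loss games with no draws; neither assumption comes from this thread, and the function name is only illustrative.

```python
import math

def elo_std_error(p, games):
    """Approximate standard error of an Elo estimate after `games` games.

    Assumes independent win/loss games (no draws), so the measured score
    has standard error sqrt(p * (1 - p) / games). The delta method then
    converts that score error into an Elo error.
    """
    score_se = math.sqrt(p * (1.0 - p) / games)
    # Under the logistic model Elo = 400 * log10(p / (1 - p)),
    # so d(Elo)/dp = 400 / (ln(10) * p * (1 - p)).
    slope = 400.0 / (math.log(10) * p * (1.0 - p))
    return slope * score_se

for p in (0.50, 0.70, 0.90):
    err = elo_std_error(p, 100)
    print(f"expected score {p:.0%}: about +/- {err:.0f} Elo after 100 games")
```

Under these assumptions the uncertainty for a fixed number of games is smallest at a 50% score and grows as the match becomes more one-sided.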

I think that it is easier to improve from 50% to 55% than from 70% to 75% or
from 80% to 85%, and we can see the value of an improvement in rybka better if we
force the first version to score only 50% by giving more time to the opponents
and then test the next rybka versions under the same conditions.
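
To put rough numbers on this, here is a minimal sketch that assumes the usual logistic Elo formula (an assumption, not something measured in this thread). The helper names are illustrative only.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_needed(start, end):
    """Elo gain required to move the expected score from `start` to `end`."""
    return elo_from_score(end) - elo_from_score(start)

for start in (0.50, 0.70, 0.80):
    end = start + 0.05
    gain = elo_needed(start, end)
    print(f"{start:.0%} -> {end:.0%}: about {gain:.0f} Elo")
```

Under this model the same five-point gain in score costs roughly 35 Elo starting from 50%, but about 44 Elo from 70% and about 61 Elo from 80%, so a fixed improvement in the engine shows up as a larger score change when the baseline is near 50%.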

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
