Computer Chess Club Archives


Subject: Re: SSDF Fritz 6 K6-2 - Shredder 2 P200MMX game 7-11/40 Now: 9,5 - 1,5

Author: blass uri

Date: 00:57:44 02/24/00


On February 23, 2000 at 23:50:44, Andrew Dados wrote:

>On February 23, 2000 at 17:49:20, Bruce Moreland wrote:
>
>>On February 23, 2000 at 15:01:14, Bertil Eklund wrote:
>>
>>>On February 23, 2000 at 12:33:50, Bruce Moreland wrote:
>>>
>>>>On February 23, 2000 at 11:08:43, blass uri wrote:
>>>>
>>>>>Shredder 2 was not tested on the fast hardware because the SSDF always uses fast
>>>>>hardware for new programs and old hardware for old programs.
>>>>
>>>>Has anyone considered that this might be a major source of error, perhaps rating
>>>>inflation?
>>>
>>>Why? Any suggestions for what to do instead? Do you think humans should refuse to
>>>play opponents rated 200 Elo higher or lower?
>>
>>There is a very major assumption buried in the Swedish list, the assumption that
>>these ratings have some correlation with ratings on the human list.
>>
>>The very best way to make the ratings correlate with the human list would be to
>>have the programs play against a variety of humans.
>>
>>Instead the games are played exclusively between machines.
>>
>>If you speed up a program's hardware, the program will become stronger against
>>other computers; you have ample evidence of this.  But it is not a foregone
>>conclusion that the program will become the same amount stronger against humans.
>>
>>The programs don't differ that much from each other, and it is possible that
>>when you increase hardware, you allow the faster player to superset the slower
>>one.  It sees the same stuff, just better and faster.  What is the result of
>>this?  I don't know, but it is possible that it is more extreme than should be
>>expected.
>
>I can see some evidence supporting your point.
> When playing a new version of my proggy against the old one, I usually run a
>'Nunn-style' match over a set of 20-40 positions (which gives 40-80 games). The
>newer version then makes use of newly implemented knowledge that the older one
>lacks. The last such match ended with a score suggesting a strength difference
>of about 150 rating points between versions. Yet my program's ICC average
>rating didn't move much, if at all...
>
>-Andrew-
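
For context, the step from a match score to a rating difference (as in the 150-point estimate quoted above) can be sketched under the standard logistic Elo model. This is only an illustration: it is not necessarily how Andrew or the SSDF compute their figures, the function names are hypothetical, and a short match gives a very noisy estimate.

```python
import math


def elo_diff_from_score(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model).

    score_fraction is points scored divided by games played, with a draw
    counting as half a point. Assumption: standard 400-point logistic curve.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def expected_score(rating_diff):
    """Expected score fraction for a player rated rating_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


if __name__ == "__main__":
    # A 150-point gap corresponds to an expected score of roughly 70%.
    print(round(expected_score(150.0), 3))
    # The 9.5 - 1.5 score from the subject line, taken at face value.
    print(round(elo_diff_from_score(9.5 / 11.0)))
```

Under this model a 150-point difference means scoring about 70% of the points, so over an 80-game match that is roughly 56 points. The formula describes only the two opponents that actually played, which is why a gap measured between two versions of one program need not carry over to a different pool of opponents.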

I suspect that the Nunn match inflates the difference in rating.

Did you try a similar match from fixed positions taken from practical games of
your program (after 10 moves)?

Uri


