Computer Chess Club Archives



Subject: Re: The big drop in the rating of my Fruit personality

Author: Uri Blass

Date: 18:07:38 10/17/05


On October 17, 2005 at 20:58:19, Ryan B. wrote:

>On October 17, 2005 at 10:10:58, Uri Blass wrote:
>
>>I am now surprised by the big drop in the CEGT rating of my Fruit personality.
>>
>>It was already 2806 after 92 games and now it is 2748 after 223 games.
>>
>>I also remember a possible error of 61 Elo after 92 games, but even if the real
>>rating is 61 Elo lower than 2806, I still do not expect the rating to change
>>so fast.
>>
>>This is also surprising because earlier results that I read, not from CEGT,
>>supported my personality.
>>
>>I wonder if the real error is not higher than the error that is written.
>>
>>I wonder what the reason for the big drop is, and whether there was a problem in
>>the matches against Spike and Jonny, which seem to be the main reason for the drop
>>of my personality (did the same tester play these matches?).
>>
>>Possible sources of mistakes in the results:
>>
>>1) Testing on different hardware relative to the previous Fruit.
>>
>>The claim of CEGT is that they test with hardware that is equivalent to a 2
>>GHz PIV, but the problem is that there is no equivalence, and it is possible
>>that one program prefers one processor over another.
>>
>>2) Testing different positions and not the same positions that were tested with
>>the earlier version.
>>
>>3) Testing against different opponents.
>>
>>Uri
>
>
>I could have told you that setting the history to 50 was not going to maintain a
>higher rating than keeping it at 70.

You could not know it and we still do not have enough games to know that 50 is
weaker than 70.

>Sure it may help in analyzing some
>positions but in game situations how often does it really help?  About 5% - 10%
>of games at most?

5-10% is significant.

>A little bit extra depth helps in every game Fruit plays

Not correct.

This little extra depth seems to be less than 0.5 ply based on test positions,
and I am sure that there are games in which it changes nothing.

>so
>it makes sense to sacrifice some level of error for extra search depth.


By this logic it also makes sense to increase the history threshold from 70 to
a higher value, because it is good to sacrifice accuracy for extra depth.

I think that we still do not have enough data to know if 70 is better than 50:
it only has a better rating than 50, and the rating difference is clearly smaller
than the possible error.
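
To give a rough feeling for how the possible error depends on the number of
games, here is a minimal sketch. It is not the method CEGT uses; it assumes a
50% score, a 30% draw ratio, a logistic Elo model and a normal approximation,
and all function names are made up for illustration.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the normal distribution


def score_to_elo(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_to_score(elo_diff):
    """Expected score implied by an Elo difference, logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def per_game_variance(score, draw_ratio):
    """Variance of a single game result (1, 0.5 or 0) for a given mean score."""
    win = score - draw_ratio / 2.0
    loss = 1.0 - win - draw_ratio
    return (win * (1.0 - score) ** 2
            + draw_ratio * (0.5 - score) ** 2
            + loss * score ** 2)


def elo_margin(games, score=0.5, draw_ratio=0.3):
    """Approximate upper 95% error margin in Elo after `games` games.

    score and draw_ratio are assumed values, not measured ones.
    """
    std_err = math.sqrt(per_game_variance(score, draw_ratio) / games)
    return score_to_elo(score + Z_95 * std_err) - score_to_elo(score)


def games_needed(elo_diff, draw_ratio=0.3):
    """Games needed before the 95% margin shrinks to roughly `elo_diff`."""
    delta = elo_to_score(elo_diff) - 0.5
    return math.ceil(per_game_variance(0.5, draw_ratio) * (Z_95 / delta) ** 2)


if __name__ == "__main__":
    for n in (92, 223):
        print(n, "games: margin about", round(elo_margin(n)), "Elo")
    print("games to resolve 20 Elo:", games_needed(20))
```

Under these assumptions the margin is close to 60 Elo after 92 games and still
close to 40 Elo after 223 games, and resolving a difference of about 20 Elo
would need on the order of 800 games.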

Uri




Last modified: Thu, 15 Apr 21 08:11:13 -0700
