Computer Chess Club Archives


Subject: Re: The big drop in the rating of my Fruit personality

Author: Uri Blass

Date: 09:00:28 10/17/05


On October 17, 2005 at 10:57:35, Heinz van Kempen wrote:

>On October 17, 2005 at 10:10:58, Uri Blass wrote:
>
>>I am now surprised by the big drop in the CEGT rating of my Fruit personality.
>>
>>It was already 2806 after 92 games and now it is 2748 after 223 games.
>>
>>I also remember possible error of 61 elo after 92 games but even if the real
>>rating is 61 elo lower than 2806 then I still do not expect the rating to change
>>so fast.
>>
>>This is surprising also because results that I read earlier not by CEGT
>>supported my personality.
>>
>>I wonder if the real error is not higher than the error that is written
>>
>>I wonder what is the reason for the big drop and if there was no problem in the
>>matches against spike and Jonny that seem to be the main reason for the drop in
>>my personality(did the same tester play these matches?).
>>
>>possible source of mistakes in the results.
>>
>>1)testing in different hardware relative to previous fruit.
>>
>>The claim of the CEGT is that they test with hardware that is equivalent to 2
>>ghz PIV but the problem is that there is no equivalence and it is possible that
>>one program likes more one processor and not another processor.
>>
>>2)testing different positions and not the same positions that were tested by
>>earlier version.
>>
>>3)testing against different opponents.
>>
>>Uri
>
>Hi Uri,
>
>okay we had the following....
>
>after 51 games---2823 ELO
>after 93 games---2806 ELO
>after 130 games---2760 ELO
>after 223 games---2747 ELO
>
>One thing often happens with EloStat. In the beginning you get very high ratings
>that in 90% of all cases cannot hold.
>
>General opinion of CEGT testers is that most settings do not give the same good
>results with longer time control than for Blitz. For Eccentric for example we
>can't reproduce good results completely, because we do not use a special book or
>learning. Any test suite of around 100 Blitz games posted here can only be an
>indication that it might be worth a try, but in many cases we will see again and
>again that it will not hold with longer time controls.
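To put the error margins discussed above in perspective, here is a minimal sketch of how an approximate 95% interval for an Elo difference shrinks as games are added. It assumes the standard logistic Elo formula and a normal approximation on the score; it is not EloStat's actual method, and the function names and the win/draw/loss counts in the usage lines are hypothetical.

```python
import math

def elo_from_score(p):
    """Elo difference implied by a score fraction p (0 < p < 1), logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high): point estimate and approximate bounds.

    Assumption: games are independent and the score is roughly normal,
    which is only a crude approximation for small samples.
    """
    n = wins + draws + losses
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - p) ** 2
           + draws * (0.5 - p) ** 2
           + losses * (0.0 - p) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    # Clamp so the logistic formula stays defined at the extremes.
    low_p = max(p - z * se, 1e-6)
    high_p = min(p + z * se, 1.0 - 1e-6)
    return elo_from_score(p), elo_from_score(low_p), elo_from_score(high_p)

# Hypothetical results with the same score fraction at two sample sizes.
small = elo_interval(40, 30, 22)     # 92 games
large = elo_interval(160, 120, 88)   # 368 games
print("92 games:  %.0f (%.0f to %.0f)" % small)
print("368 games: %.0f (%.0f to %.0f)" % large)
```

With four times as many games the interval is roughly half as wide, so a rating based on about 90 games can still move by several tens of Elo without anything being wrong in the testing.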

That would be surprising for the history threshold, because it was not
designed to be a blitz setting.

My tests showed that some of Fruit's mistakes in CEGT and WBEC games at long
time control could be prevented by reducing the threshold to 50.

Note that I did not play blitz games before suggesting the setting. My idea
was that if this setting can prevent Fruit's mistakes at long time control, it
may be productive.

Initially I even thought that the setting was probably bad at blitz, because
otherwise the Fruit team could have found it through their own testing. My
intuition says that history pruning is more dangerous at long time control
than at blitz (note that this is also Vincent Diepeveen's opinion, but that is
not the reason for my view, because usually the opposite of what Vincent says
is correct).

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
