Author: Uri Blass
Date: 18:07:38 10/17/05
On October 17, 2005 at 20:58:19, Ryan B. wrote:

>On October 17, 2005 at 10:10:58, Uri Blass wrote:
>
>>I am now surprised by the big drop in the CEGT rating of my Fruit personality.
>>
>>It was already 2806 after 92 games and now it is 2748 after 223 games.
>>
>>I also remember possible error of 61 elo after 92 games but even if the real
>>rating is 61 elo lower than 2806 then I still do not expect the rating to change
>>so fast.
>>
>>This is surprising also because results that I read earlier not by CEGT
>>supported my personality.
>>
>>I wonder if the real error is not higher than the error that is written
>>
>>I wonder what is the reason for the big drop and if there was no problem in the
>>matches against spike and Jonny that seem to be the main reason for the drop in
>>my personality(did the same tester play these matches?).
>>
>>possible source of mistakes in the results.
>>
>>1)testing in different hardware relative to previous fruit.
>>
>>The claim of the CEGT is that they test with hardware that is equivalent to 2
>>ghz PIV but the problem is that there is no equivalence and it is possible that
>>one program likes more one processor and not another processor.
>>
>>2)testing different positions and not the same positions that were tested by
>>earlier version.
>>
>>3)testing against different opponents.
>>
>>Uri
>
>
>I could have told you that setting the history to 50 was not going to maintain a
>higher rating than keeping it at 70.

You could not know it, and we still do not have enough games to know that 50 is weaker than 70.

>Sure it may help in analyzing some
>positions but in game situations how often does it really help? About 5% - 10%
>of games at most?

5-10% is significant.

>A little bit extra depth helps in every game Fruit plays

Not correct. This little extra depth seems to be less than 0.5 ply based on test positions, and I am sure that there are games in which it changes nothing.

>so
>it makes sense to sacrifice some level of error for extra search depth.
By this logic it also makes sense to increase the history threshold from 70 to a higher value, because it is good to sacrifice speed for extra depth.

I think that we still do not have enough data to know if 70 is better than 50; it only has a better rating than 50, but the possible error is clearly smaller than the rating difference.

Uri
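
For readers who want to check how much a rating can be expected to move between 92 and 223 games, here is a minimal sketch of the usual error-margin arithmetic. It is not the CEGT method: the function names, the 50% score, the 30% draw rate and the 95% level (z = 1.96) are all assumptions chosen for illustration, and games are treated as independent.

```python
import math

def elo_from_score(p):
    # Logistic Elo model: rating difference implied by an expected score p (0 < p < 1).
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_margin(games, score=0.5, draw_rate=0.3, z=1.96):
    # Hypothetical helper: approximate margin of error (in Elo) of a rating
    # measured over `games` independent games.
    # score and draw_rate are assumed values, not taken from the CEGT list.
    win = score - draw_rate / 2.0
    loss = 1.0 - win - draw_rate
    # Variance of a single game result (1 for a win, 0.5 for a draw, 0 for a loss).
    var = (win * (1.0 - score) ** 2
           + draw_rate * (0.5 - score) ** 2
           + loss * (0.0 - score) ** 2)
    se = math.sqrt(var / games)  # standard error of the mean score
    lo = elo_from_score(score - z * se)
    hi = elo_from_score(score + z * se)
    return (hi - lo) / 2.0

for n in (92, 223):
    print(n, "games: about +/-", round(elo_margin(n)), "Elo")
```

Under these assumptions the margin for 92 games comes out close to the 61 Elo quoted above, and it shrinks only with the square root of the number of games, so it is still large after 223 games.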
Last modified: Thu, 15 Apr 21 08:11:13 -0700