Computer Chess Club Archives

Subject: Re: The Validity of CC Testresults - Take my Word for that one!

Author: Rolf Tueschen
Date: 13:43:23 01/20/06
On January 20, 2006 at 16:36:54, Günther Simon wrote:

>On January 20, 2006 at 16:23:38, Rolf Tueschen wrote:
>
>>On January 20, 2006 at 14:22:06, Günther Simon wrote:
>>
>>>On January 20, 2006 at 13:56:49, Rolf Tueschen wrote:
>>>
>>>>On January 20, 2006 at 13:49:10, Günther Simon wrote:
>>>>
>>>>>On January 20, 2006 at 13:41:20, Rolf Tueschen wrote:
>>>>>
>>>>>>On January 20, 2006 at 11:51:48, Uri Blass wrote:
>>>>>>
>>>>>>>On January 20, 2006 at 05:28:47, Rolf Tueschen wrote:
>>>>>>>
>>>>>>>>On January 20, 2006 at 04:58:11, enrico carrisco wrote:
>>>>>>>>
>>>>>>>>>On January 20, 2006 at 03:14:09, Mike Byrne wrote:
>>>>>>>>>
>>>>>>>>>>http://www.chessolympiad-torino2006.org/eng/index.php?cav=1&dettaglio=309
>>>>>>>>>>
>>>>>>>>>>good stuff...
>>>>>>>>>
>>>>>>>>>Yea -- he even cited the "Anti-computer chess expert" Pablo Ignacio Restrepo.
>>>>>>>>>What more would we need?
>>>>>>>>>
>>>>>>>>>-elc.
>>>>>>>>
>>>>>>>>Yes, this, and also the point that not everything quoted by a GM, here GM
>>>>>>>>Golubev, is automatically on a par with Newton's paper on gravitation or
>>>>>>>>Einstein's paper on relativity. It's more or less bogus. I want to add a
>>>>>>>>single item so that my opinion doesn't look cheaply arbitrary.
>>>>>>>>
>>>>>>>>The CEGT test guys are mentioned (I think some 15 persons) and it sounds as if
>>>>>>>>they were a sort of institution for certain questions in CC. Comparable to what
>>>>>>>>we meant when we spoke of "the new SSDF list" in the 90's. The problem begins if
>>>>>>>>I question whether Rybka is already proven to be the strongest engine today.
>>>>>>>>Then people tell me to look at CEGT, where that has been proven... This was a
>>>>>>>>few days ago here in CCC. I must object to that sort of hubris. The truth is
>>>>>>>>that we don't have statistical methods for making such claims. Even after 700
>>>>>>>>or maybe over 1000 games the significance is not so sure, and if you look at
>>>>>>>>the +/- boundaries of the so-called Elo results you still have overlaps and
>>>>>>>>you can't say that Rybka is the clear first. - Nothing against the testers of CEGT. The
>>>>>>>>presentation of the results is nice. The games download is also well organised.
>>>>>>>>But all that can't hide the fact that certain statistical requirements
>>>>>>>>must be respected if one wants to make clear statements. We are all too
>>>>>>>>human. In a world of huge uncertainties and big problems overall, we feel the
>>>>>>>>need to do something for our well-being in such a hobby. Where, if not there,
>>>>>>>>could we find our peace of mind? We can test. We can create a whole network of
>>>>>>>>testers. But if we then want to make clear statements, alas, we are all subject
>>>>>>>>to the iron laws of statistics. And basically we can't get what we want to
>>>>>>>>have. We are bound to believe in our private preferences. We can also assume
>>>>>>>>that currently, for a short time, Rybka is "certainly" looking like a very strong
>>>>>>>>engine. But anything beyond that would be bogus. We should all keep that in
>>>>>>>>mind. Development in CC never stands still. There is no such thing as the best
>>>>>>>>all-time engine for the next 10 years. If I got the newest supercomputers
>>>>>>>>of the US military, it could well be that I would become the next World Champion
>>>>>>>>with Gullydeckel, to give an absurd example, or with my personal shooting star The
>>>>>>>>Roaring Thunder, which was developed in my kitchen for the next WCCC in Torino...
>>>>>>>>I digress a little.
>>>>>>>
>>>>>>>Here are the CEGT single processor results
>>>>>>>
>>>>>>>I ignore single processor result
>>>>>>
>>>>>>It struck me as rather importunate when I read today the campaign by
>>>>>>Simon/Pittlik? and Lagershausen, and when I read your lecture here, dear Uri, I'm
>>>>>>quite sure that it's impossible to tell people the complex truth if they are
>>>>>>used to believing in simple truths. I have learned over a long time how careful
>>>>>>one should be in statistics. Honestly, Uri, what you are doing here is not
>>>>>>permitted. You can't take a list of results, simply remove certain entries, and
>>>>>>THEN compare ratings that still include their results. That is your first crass
>>>>>>mistake. Of course I too know that you can't simply compare 1-processor with
>>>>>>2-processor programs. And that wasn't at all what I was trying to do.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>You can see that the other single-processor programs are rated below 2800,
>>>>>>>while even the 32-bit version of Rybka is rated above 2815 and the top 64-bit
>>>>>>>version above 2850.
>>>>>>>
>>>>>>>No overlap.
>>>>>>>
>>>>>>>Rank Name                          Elo    +    -  Games  Score   Av.Op.  Draws
>>>>>>>   1 Rybka 1.01 Beta 9 64-bit opt  2921   73   68     71  80.3 %  2677   33.8 %
>>>>>>>   2 Rybka 1.0 Beta 64-bit         2859   21   21    765  68.4 %  2725   32.7 %
>>>>>>>   4 Rybka 1.0 Beta 32-bit         2825   10   10   3575  68.9 %  2687   31.0 %
>>>>>>>   6 Fruit 2.2.1                   2786    8    8   5035  66.0 %  2671   33.1 %
>>>>>>>   7 Fritz 9                       2782   11   11   2724  62.8 %  2691   30.2 %
>>>>>>>   9 TogaII 1.1a                   2772   14   14   1560  60.3 %  2699   36.3 %
>>>>>>>  10 Hiarcs 10 Hypermodern         2771   22   22    644  53.3 %  2749   35.7 %
>>>>>>>
>>>>>>>The only CEGT entry that could in theory exceed 2800 on one CPU is Deep
>>>>>>>Fritz 8, but Deep Fritz 8 on 2 CPUs is below 2800, and it is illogical to expect
>>>>>>>Deep Fritz 8 on one CPU to be rated higher than that.
>>>>>>>
>>>>>>>8 Deep Fritz 8 2CPU 512MB 2772 14 14
>>>>>>>15 Deep Fritz 8 1CPU 2754 107 104
>>>>>>>
>>>>>>>The fact that Rybka is also number 1 in some of the other lists, even where its
>>>>>>>lead is not significant on its own, probably further increases the certainty that
>>>>>>>Rybka is the best engine, because the probability that an engine which is not the
>>>>>>>best takes first place in every serious list is very small.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>Let's come to the second crass mistake in your argument. You see the
>>>>>>first-place result for Rybka, as I do, and you conclude that this in itself
>>>>>>amounts to proof. That is already the mistake, because you conclude
>>>>>>that first place means greatest strength as such. NB: in statistics you measure
>>>>>>and then you claim that your measurement is valid because you kept everything
>>>>>>of importance under control. I simply object that this is wrong for the current
>>>>>>situation because - as I have already debated with Bob Hyatt - Rybka currently
>>>>>>holds the initiative while all the others must react now or tomorrow. What the
>>>>>>results show is the improvements of Rybka against unchanged older programs. And I
>>>>>>claim, without great risk, that any strong program will gain an advantage if
>>>>>>the others haven't been able to react yet.
>>>>>
>>>>>Rating lists don't show ratings of the future versions. I doubt Bob discussed
>>>>>astrology with you. The thread is about today not about future strength,
>>>>>no idea why you changed the topic. Ah wait I know why you changed it ;)
>>>>>
>>>>>Guenther
>>>>
>>>>
>>>>Just relax, please. I don't speak of the future. I speak of the factor you didn't
>>>>consider and couldn't control in the current testing. Never heard of the
>>>>existing advantage of a new entry? This is not rocket science; you could
>>>>well follow the debate if you could forget for a moment that you wanted to flame
>>>>me... just give truth a chance. I'm wrong often enough, then you can jump on me,
>>>>but this here is so trivial that you lose the debate big time.
>>>
>>>Your little hole in the ground gets smaller and smaller - big time ;-)
>>>Computer chess rating lists also don't measure 'new entry psychology'.
>>>Programs don't care about your psychology...
>>>If it had any significant influence, every new entry would have been
>>>number 1, which is not the case; no CCC science needed.
>>>Have fun working out a 'new entry' formula together with your
>>>rating program.
>>
>>
>>I have a little question about your email address: are you Volker Pittlik? Because
>>I never before talked to a Günter Simon. :)
>
>Maybe you, as a computer/internet illiterate, are not able to
>distinguish between web hosting addresses and e-mail addresses?
>
>My name is Günther Simon - I forgive you omitting that silent 'h' -
>and my e-mail is g.simon.rgbg*AT*t-online.de, which is a real
>commercial provider's address. Any further questions? ;)
>If you weren't so illiterate you could have done a simple Google
>search for my name + chess XOR computer chess, or anything similar.
>It would also have helped to look up a certain other specific high-level
>computer chess forum...
>
>
>G.S.


I know that I know nothing, but at least I have the guts to ask questions
without insulting people. Since you mention rgbg, we will certainly meet one
sunny day, as I am practically from nbrg. :)

Till then have a good time here in CCC; I hope you enjoy the topics even if you
couldn't flame me. So, let's enjoy CC united.

R.
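
The disagreement in the quoted posts over whether the CEGT error margins overlap can be checked mechanically. What follows is only a minimal sketch in Python, assuming that the two numbers after each rating in the quoted table are the upper and lower error margins. The function names are hypothetical, and comparing intervals like this is a rough screen, not a formal significance test.

```python
# Rows copied from the quoted CEGT excerpt: (name, rating, plus, minus).
# Assumption: "plus" and "minus" are the upper and lower error margins.
ROWS = [
    ("Rybka 1.01 Beta 9 64-bit opt", 2921, 73, 68),
    ("Rybka 1.0 Beta 64-bit", 2859, 21, 21),
    ("Rybka 1.0 Beta 32-bit", 2825, 10, 10),
    ("Fruit 2.2.1", 2786, 8, 8),
    ("Fritz 9", 2782, 11, 11),
    ("TogaII 1.1a", 2772, 14, 14),
    ("Hiarcs 10 Hypermodern", 2771, 22, 22),
    ("Deep Fritz 8 2CPU 512MB", 2772, 14, 14),
    ("Deep Fritz 8 1CPU", 2754, 107, 104),
]


def interval(row):
    """Return the (low, high) rating interval for one table row."""
    _name, rating, plus, minus = row
    return rating - minus, rating + plus


def overlaps(a, b):
    """True if the rating intervals of rows a and b share any value."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


def overlapping_with(target_name, rows=ROWS):
    """Names of all other entries whose interval overlaps the target's."""
    target = next(r for r in rows if r[0] == target_name)
    return [r[0] for r in rows
            if r[0] != target_name and overlaps(target, r)]


if __name__ == "__main__":
    for name in ("Rybka 1.0 Beta 32-bit", "Fruit 2.2.1"):
        print(name, "->", overlapping_with(name))
```

With these rows, the only interval that overlaps that of Rybka 1.0 Beta 32-bit belongs to Deep Fritz 8 1CPU, whose margins are very wide; the other entries rated below 2800 overlap one another but not Rybka.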
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.