Computer Chess Club Archives


Subject: Re: Testmethods for n=0, n=1 and n=>800 - For Beginners and 'old Hands'

Author: Rolf Tueschen

Date: 07:38:17 09/13/02


On September 13, 2002 at 10:15:49, Uri Blass wrote:

>On September 13, 2002 at 09:20:26, Rolf Tueschen wrote:
>
>>Computerchess (CC) can be boring if one loses track of chess. Then it can
>>become a mere application of programming and computer science. Good enough for
>>the talented but not enough for the wealth of chess. The occupation with
>>computers, programs and games has the same addictive aspects as motion
>>pictures and television, only that you can actively participate. With the
>>internet, a whole virtual reality finally exists.
>>
>>Whoever becomes acquainted with chess must inevitably land in CC. With CC and
>>the chess programs the mystique of human chess is gone. Whether as a beginner
>>or as a master, you used to collect games and analyses. But no matter how hard
>>you tried, you could fill thousands of index cards, but you couldn't collect a
>>couple of million games. Today, what is played miles away is on your display
>>seconds later. Oversights are detected in seconds with the current software.
>>
>>Now, chess has a tradition as a sport in the Anglo-American world, closely
>>connected with the science of measuring and counting who's the best. May I
>>inform western readers that in the tradition of the ancient East it's more
>>about personal life and personal training to reach some individual perfection,
>>no matter how this differs from the perfection of others. Anyway, because it's
>>very common to build up ranking lists, the same took place in CC.
>>
>>Let's quickly compare human lists and computer rankings. The Elo method allows
>>one to calculate the individual strength (performance) over the variable of
>>age. In CC, programs have no age at all, because almost every new version gets
>>completely new limbs and organs, so to speak. That means that you can't compare
>>the old and the new version. Or would you compare the embryo with M. vos Savant?
>>We remember the old saying "You can't compare apples with beans". Nevertheless
>>CC has had ranking lists for decades now, with the astonishing result that the
>>newest progs are on top and the oldest, on the weakest hardware, are at the
>>bottom. Big surprise!
>>
>>Now the industry wants to know if its newest "babies" are at least a little bit
>>"better" than the former version. Not that it matters, but PR needs a minimum
>>of authenticity.
>>
>>So how would you measure "better", and how much is better? What is exactitude
>>in a world as fuzzy as chess? Chess is comparable with differential mathematics
>>because there's no 'finite' until the game has been solved. And don't forget
>>there are more possible chess games than atoms in the universe! Don't hold your
>>breath waiting for chess to be solved next week. Won't happen in a lifetime.
>>
>>So, I repeat, how do you want to measure and calculate in chess? Isn't chess the
>>game with always new discoveries in almost every new game? How many games must
>>you run to know which version is stronger than its predecessor? 0, 1 or over
>>800?
>>
>>The answer is short. No matter how many games you run, or even if you'd run no
>>game at all, you get results. Here they are:
>>
>>With n=0:
>>
>>I know for _sure_ that the next version is always stronger than the former.
>>Of course 5% uncertainty included! Take my bet?
>>
>>With n=1:
>>
>>Here we have Thorsten. He has been testing chess machines for two decades now,
>>since he was 13 or so. Of course he _knows_ which version is stronger after a
>>game or two! Again, the 5% included!
>>
>>With n=>800:
>>
>>Ed Schroder has a dozen machines and lets them play autoplayed games. Could he
>>get exact results? Not really, but probably he knows what he wants to know after
>>800 games. But - also Ed has a risk of 5% or 2%.
>>
>>Is this another chapter of the devil's black humour? No, not really! I wanted to
>>show you that even with the most sensitive statistics you can't get certainty in
>>chess. And let me tell you that statistics is simply not made for chess. This is
>>like maths in astrology instead of astronomy. Period.
>>
>>Take a 100 m final in athletics. Either someone is visibly faster, and then
>>he's the best, or you can't decide with your own eyes who's the winner, and then
>>there is no winner at all, no matter how many digits you define. As humans we
>>don't take the one runner with two nanoseconds less as the "best"! We say
>>simply that they are equally strong. And that should be remembered in CC too. If
>>you get a result of 52-48 then the two progs are equally strong. And no voodoo
>>with statistics could bring more clarity. And 720 to 680 is - in chess with
>>computers - also almost equally strong. You can't automatically get "better"
>>results in CC by simply raising n. Why? Because the whole thing with
>>statistics is the underlying distribution. Strength should be a normal
>>distribution, but it isn't in CC. In CC almost all depends on hardware. The rest
>>is so minimal that you can't detect it statistically.
>
>I disagree.
>Most of the population of chess programs is clearly weaker than the top
>programs.
>
>Gnuchess loses against Crafty even if you give Gnuchess hardware that is 10
>times faster, provided the time control is slow enough. And Gnuchess is not a
>weak program; it is at the level of the average amateur program.
>
>Uri

I agree. This was chapter one though. It seems fair enough that GNU, which has
no clue about endgames, tablebases, not even GM books, and is an amateur
program on top of that, is weaker than Crafty. Was GNU ever tuned against
Crafty? I mean, if I took GNU on as a pro I would make at least 8th place in
SSDF out of it. But actually we are comparing apples and beans. GNU is not of
"this" world now. BTW I played SIBIRIAN, and for that nice prog I promised you
the same! Implement all the modern stuff and it will play billy bully with
FRITZ, I suppose. Not even needing tablebases. Cough.

Rolf Tueschen
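
As a rough check on the 52-48 and 720-680 scores discussed in the quoted post, here is a minimal Python sketch. It is an illustration under stated assumptions, not anything taken from the thread: the function names are hypothetical, every game is treated as a decisive win or loss (draws are ignored), the Elo difference uses the standard logistic formula, and the significance test is a normal approximation against the hypothesis that both programs are equally strong.

```python
import math


def elo_diff(wins_a: float, wins_b: float) -> float:
    """Elo difference implied by A's score fraction (logistic Elo model)."""
    score = wins_a / (wins_a + wins_b)
    if not 0.0 < score < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def z_score(wins_a: float, wins_b: float) -> float:
    """Standard deviations between A's result and an even match.

    Assumption: each game is an independent coin flip under the null
    hypothesis, so the variance of A's win count is n * 0.25.
    """
    n = wins_a + wins_b
    return (wins_a - n / 2.0) / math.sqrt(n * 0.25)


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    for a, b in [(52, 48), (720, 680)]:
        z = z_score(a, b)
        print(f"{a}-{b}: Elo diff {elo_diff(a, b):+.1f}, "
              f"z = {z:.2f}, p = {two_sided_p(z):.2f}")
```

Under these assumptions both scores fall short of the usual 5% threshold (|z| of about 1.96): the z-scores come out near 0.4 and 1.07. Counting draws as half points would lower the per-game variance, so real matches with many draws can be somewhat more informative than this sketch suggests.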


