Computer Chess Club Archives



Subject: Re: Standard deviations -- how many games?

Author: Rolf Tueschen

Date: 01:56:28 01/24/04



On January 23, 2004 at 21:27:43, Dann Corbit wrote:

>On January 23, 2004 at 21:08:27, Bob Durrett wrote:
>>I don't wish to muddy the waters too much but the fact is that chess-playing
>>programs or machines do not enter tournaments with zero information known about
>>them.  Just as in human tournaments, prior knowledge known prior to any games
>>being played in the tournament can be very significant.
>>
>>Consider a trivial example:  Suppose a top GM is to play a chess match against a
>>true chess beginner.  It is known apriori that the top GM is a whiz at chess and
>>the beginner is a washout.
>>
>>Will it take thirty games to determine who is better?  No, it will take ZERO
>>games.
>
>You are wrong.  It will take 30 games before we know anything about the unknown
>player.

ROFL.

Dann, you are being carried away by your own flawed science, believe me. Bob D. spoke of a
really new and untrained newcomer, understood? Even a Fischer or a Tal, right after
learning how the pieces move, played like a beginner. The example was NOT
meant to be about someone "unknown in public tournaments". Of course such a player of
genius could become a real threat if he had already played thousands of games against his
uncle. See Morphy, for example.

It is very funny how we are misled when we rigidly follow our "science" while in
reality something quite different is happening.

The joke here is that the two of us were discussing the SSDF, and there it is NOT the case
that a _weak_ newcomer meets a strong old player; it goes the other way
round.

In the SSDF they match a very old program against a newcomer from the newest
generation of progs, and we know the result in advance, because the old prog -
most of the time also running on weaker hardware [NOT equal hardware!] - has no chance
at all. Here at last you should understand that comparisons with human chess are
totally useless. In human chess the old, experienced player will always be
superior to a newcomer [this is what Bob D. said, which you cannot believe
because you are coming from computer chess, methinks, but it is trivial for a
chess player, I mean a real player who has already played in human tournaments and
clubs], and when I say newcomer, I mean an untrained newcomer who has just
learned the rules of the game.

In that case we don't need 30 games to predict the result; we need zero games
to predict who is stronger! And for the SSDF we can say the same: in these matches
between outdated progs on older hardware and the newest generation, even though these
new entities are completely new [and inexperienced as far as the SSDF is concerned, hehe],
we don't need 30 games to predict who is stronger. It is always the new prog. And this is
why the way this is practiced in the SSDF is complete nonsense.

Now, you said that these games against older progs on older hardware were
extremely important. I know. Because the SSDF bases its validity on these older progs.
These older progs themselves played against even older
progs, and this way the chain goes back to the long-ago competitions between long-forgotten
progs and Swedish masters. Hehe. Dann, this is all nonsense,
to be honest. It's statistically complete nonsense; it sounds as if we were
doing medical homoeopathy with not one molecule left in the oceans of
computer chess. But we are still pretending that there is a linkage to the past.
And it's true, we human beings have deep roots in our past, but these machine
progs are completely deterministic in this respect. We KNOW for certain what
a new program will do to the old, out-of-fashion warrior. Putty! And we can
prove it at different depths and speeds.

So, my argument against the SSDF runs like this. Because they leave the pool
of each new generation, hypostatize a linkage between these new progs and
their predecessors, and stage unrealistic and predictable matches between old and
new, the results are worthless. They only show what we already knew _without_
a single game. Bob D. tried to explain that.

Now, what is happening? We are still talking about chess. And chess itself means
that the outcome depends on the chances of the chess position. Sometimes you are
lucky, as we chess players say. So what you get with these short matches is
mostly luck, which spoils the raw data that would otherwise be completely predictable.
And this way the SSDF gets a ranking list that looks as though the new progs
themselves had important differences. But if you examine the error margins of the bare
numbers, the SSDF's own figures show that the claimed differences are within the
limits of normal statistical variation. These normal variations create the
impression of ever-new differences between basically equal entities in the pool of a
new generation. Because hardware is the main factor in today's computer chess.
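To make the point about error margins concrete (again my own sketch, not the SSDF's
method; the 40-game match result below is a hypothetical example), here is how one could
compute a rough 95% confidence interval for a short match, using a normal approximation
of the score's variance:

import math

def elo_diff(score):
    """Convert a match score fraction (strictly between 0 and 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_confidence(wins, draws, losses, z=1.96):
    """Approximate 95% confidence interval for the score and the Elo difference."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win=1, draw=0.5, loss=0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    lo, hi = score - z * stderr, score + z * stderr
    return score, (lo, hi), (elo_diff(lo), elo_diff(hi))

# Hypothetical 40-game match, roughly the size of a single SSDF pairing.
score, (lo, hi), (elo_lo, elo_hi) = score_confidence(wins=14, draws=16, losses=10)
print(f"score {score:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
print(f"Elo difference roughly between {elo_lo:+.0f} and {elo_hi:+.0f}")

With only 40 games the Elo interval here spans from about -48 to +122 points, which is
exactly the kind of overlap that makes the small differences between the top progs
statistically meaningless.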

You must be an experienced scientist to see through the hoax of statistical
results. But you and everybody else once learned that to produce good statistics
you absolutely MUST keep your measured variable under control; better still, you must
control all the remaining variables you don't measure, because otherwise you don't
know what influenced what. This is the basic problem with all SSDF activities. They compare
apples with beans. And they publish honestly: here are beans, and there we have
bananas. And the beans actually occupy the top ranks. But the human mind
is lazy, and it reads: ahem, the SSDF has published its newest results, oh, surprise,
this time Fritz lost first place to Shredder, and some predict that next time
Ruffian will be the new number one. How interesting.






> Consider this:
>At one time, Kasparov, Fischer, and Tal had an Elo below 2000 and were
>completely unknown.  They came out of the woodwork and started blasting the
>bejabbers out of people.  Just because we know someone is talented, does not
>mean we can use that data to extrapolate the level of talent of an unknown
>entity.

All correct. For human chess.



>
>>The number of games required depends on the prior knowledge about the
>>contestants.
>
>There is no connection at all.

This, again, as I said, is only taken for a correct conclusion because of a lack of
experience. In reality the statement by Bob D. is trivially
correct. But this is more of a pedagogic question: how to explain
it to a real know-it-all who has learned his stuff and thinks he knows everything already.
Breaking through that wall is very difficult.



>However, we will gather more and more
>information about the strength of the unknown opponent as more games are played.
> He could be weaker, stronger, or the same as the great player.  Imagine someone
>who does not play humans but has played against computers for 5 years.  He might
>be a very good player that nobody has heard of.  Of course, it is not likely
>that a player will be better than Kasparov or Anand.  But until the games are
>played, we won't know.  And 3 games against Kasparov will tell us very little.
>Even if Kasparov loses all 3 games.

He will surely lose all of them. I bet. ;))



>
>>I hope this is not too distressful for anybody.  : )
>
>Bad science.  Using your intuition to do science is a very bad idea.


Yes. But it is even worse to forget your own thinking while doing science.
Sorry, I didn't mean anyone around here personally. ;)




> It is good
>to form theories using intuition.  But it is bad to assert the truth of your
>feelings without testing.

This is true, but the testing must follow the iron rules of science. As I said, the
most important thing is the control of the variables. If you don't control them, you don't
even know what you are "measuring". And your results are - yes - they are worthless.

Rolf



