Computer Chess Club Archives


Subject: Re: Gandalf H, First Impressions

Author: Dann Corbit

Date: 17:12:42 03/06/01


On March 06, 2001 at 19:21:47, Fernando Villegas wrote:

>On March 06, 2001 at 15:27:35, Dann Corbit wrote:
>
>>On March 06, 2001 at 14:50:03, Fernando Villegas wrote:
>>
>>>Hi:
>>>You are right that it is not convincing enough in scientific terms, but it was
>>>never meant to be more than an opinion; a grounded one, I think, but an opinion
>>>after all. BTW, I did play the same openings with every version, but besides
>>>that, even with different openings, you can more or less feel the degree of
>>>aggressiveness a program has from the kind of moves it chooses from the pool of
>>>acceptable moves. In this sense H seems more interesting to me. I am sure that
>>>you can feel, for instance, the difference in style between, say, Genius 5 and
>>>CSTal no matter what opening you play. This is, of course, a very extreme
>>>example, but it makes the point...  to a degree.
>>
>>If we are talking about impressions then I don't think there is really anything
>>to argue about.  In fact, I can form an impression from just looking at the box.
>>;-)  Since an impression is nothing more than a viewpoint it is neither right
>>nor wrong.  It is an impression.
>
>
>Maybe you go too far. There are all kinds of impressions. If we understand the
>term in its least valuable sense, as it seems you do, it is just a casual,
>cursory judgement with no facts behind it and no decent assessment of those
>facts. But on the opposite side what we have is not something altogether
>different from impressions, a kind of absolute objectivity, but just the sensory
>impressions scientists get with their trained eyes, with or without instruments.
>They have a method, I know, but the non-scientific way of looking at things also
>has one; it is called common sense, and it is often very useful. Visual
>impressions alone, not vector calculus, are more than enough for driving a car.
>You get results, you get home safely, and nobody could tell you "you drove the
>car only with impressions...so you did badly". There are many realms of the
>world where we do not need an exhaustive scientific method to grasp the truth
>and to do the correct thing. At least in some areas. In this case we
>cannot, without statistics, make accurate judgements about programs, about which
>is the best, etc., but we can grasp with just the eyes some generalities about
>how they play that are fair enough to be shared. Impressions are just the way we
>perceive the world, after all. And they are always formed from a point of view.
>We cannot avoid points of view or the views that come with them, and neither is
>necessarily just trash. So when I talk about my impressions of a program, you
>cannot assume that they are as useless as looking at the box. It can be a
>decently correct impression, or at least a possibly fair one, with some data to
>support it. Of course it can be wrong, but then, if it can be wrong, it can also
>be right. In other words, it is not beyond the realm of right or wrong. Not
>organized, not systematic, not enough data, yes, but even so on the axis of
>truth-falsehood.

Sometimes you know you are probably inside the cone of reality with your
judgements.  The difficult part is to say where you are or if (in reality) you
are just outside the fuzzy edge.  Since the judgement is so difficult to make in
the first place, we might as well make it as easy as possible [if we plan to
make numerical estimates].

If it is just intended to be subjective then it really does not matter.  For
example:
"Golem is strong."
This is a true statement.  A "pre-Columbus" ELO player will have a tough time
with it.

"Golem has an ELO of 2300 or better"
This statement may be true or false.  What version of Golem?  Against what pool
of competition?  If the opponents are ten year olds who have only played chess
for one week, then the ELO figure might be 2800.
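To make the "against what pool?" point concrete, here is a minimal sketch of the
standard Elo expected-score curve.  It only shows that a rating predicts results
relative to the opponents' ratings and means nothing by itself.  The function
name and the pool ratings are made up for illustration; they are not figures for
any real Golem version.

```python
def expected_score(rating, opponent_rating):
    """Expected score per game (win = 1, draw = 0.5) under the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


# Hypothetical pool ratings, chosen only for illustration.
for pool in (1200, 2300, 2600):
    print(pool, round(expected_score(2300, pool), 3))
```

The same nominal 2300 scores very differently against each pool, so a rating
earned in one pool does not automatically carry over to another.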

Let's suppose I try to measure the strength.  I play 50 Golem games and change
the version.  Then I try an opening book.  But I don't like the new version and
I change back.  Then I notice something funny in the source and change and
recompile.  So I run that version for the next 97 games.  Then I make another
change and try 150 games like that.  Now I decide I don't like the second
change, so I put it back the way it was and play a further 99 games.  Then I add
a bit of random jitter so it won't be quite as predictable and run 50 games.
The program plays dumb moves, so I take the jitter out for 45 games.  Then I put
in a smarter random jitter and round out 1000 games.  When finished, I announce
that Golem has an ELO of 1985 +29/-31.

If you presented that evidence to me, I would hold the numbers in very low
regard.

Let's try another example.

A GM watches two games played by an unknown computer program.  He likes what he
sees and announces that the program is incredibly talented.  He estimates the ELO
at 2300.  Then he watches another 50 games and downgrades it considerably to 2000.

Now, I like the second number better than the first.  Even so, I would not
consider it an accurate figure, but an estimate.

Now, suppose that we have 100 GMs play the same program ten games each, without
reprogramming or changing the books.  If the program learns, then it learns.  If
the GMs notice something, then they notice something.  When we are done, we
compute an ELO of 1875 +/- 35.

Personally, I trust this estimate the best of all.  The reason is that we are
only trying to measure one thing at a time -- or at least hold all other things
fairly constant.
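
For what it is worth, here is a rough sketch of how a figure like "rating
+plus/-minus" can be produced from a block of games.  It assumes every opponent
has the same known rating (opp_avg), that the games are independent, and that
the program does not change between games, which is exactly the "hold
everything else constant" condition.  It uses a simple normal approximation;
real rating lists use more careful methods.  All names and the sample numbers
are hypothetical.

```python
import math


def elo_diff(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_estimate(wins, draws, losses, opp_avg, z=1.96):
    """Return (rating, lower, upper) from a W/D/L record against one pool.

    Assumption: all opponents share the rating opp_avg and games are independent.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    s = (wins + 0.5 * draws) / n  # mean score per game
    # Per-game variance of the score around its mean.
    var = (wins * (1.0 - s) ** 2 + draws * (0.5 - s) ** 2 + losses * s ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def clamp(x, eps=1e-6):
        # Keep scores away from 0 and 1, where the Elo curve is unbounded.
        return min(max(x, eps), 1.0 - eps)

    rating = opp_avg + elo_diff(clamp(s))
    lower = opp_avg + elo_diff(clamp(s - z * se))
    upper = opp_avg + elo_diff(clamp(s + z * se))
    return rating, lower, upper


# Hypothetical record: 1000 games against a pool assumed to average 2000.
print(elo_estimate(400, 200, 400, 2000))
```

The interval only describes sampling noise.  If the program was changed halfway
through, as in the first example, the error bars look just as tight but no longer
describe any single program.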




Last modified: Thu, 15 Apr 21 08:11:13 -0700
