Computer Chess Club Archives



Subject: Re: mclane's summer-tournament: round 6 update

Author: Thorsten Czub

Date: 15:54:20 09/06/98



On September 06, 1998 at 16:55:48, Don Dailey wrote:

>I think your method is pretty flawed if it depends very much on
>subjective analysis.  For instance, if I did not like a program,
>I would be looking for problems and I would find them.  I would
>dismiss the good moves and make excuses.  I wouldn't do this
>on purpose, I would do it simply because I was human.

I would bet that I am faster than your statistical approach, and also more
precise.
My method has only one weakness: I can only be exact with programs I have a
positive feeling for, a good relationship with.

My method works like relationships with human beings.
You can only make positive progress if your relation with the other human
being is OK. If the relation is shit, you cannot produce any good results.


>Your method should always be a part of the whole testing philosophy
>however.  You can get into the trap of playing thousands of
>games and not seeing a single one.  I think you must continue
>to do what you do, but other more objective tests MUST also be
>performed.

I do statistical stuff too. But it is not done to decide about strength; it is
only done to prove my prejudices, or to put it more positively, to prove my
thesis.

I do not begin with RESULTS. I only use them to prove the judgement.
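
To make that concrete, here is a minimal sketch in Python of how a match
result can check a prior judgement rather than form one. The 60-out-of-100
score is a made-up example, not real data, and I assume decisive games only
(draws would have to be split or excluded):

from math import comb

def p_at_least(wins, games, p=0.5):
    # Exact binomial tail: the probability of scoring at least
    # `wins` out of `games` if the true win rate were `p`.
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins, games + 1))

# Made-up match: program A beats program B 60 times in 100
# decisive games. Under the null "equal strength" (p = 0.5):
print(p_at_least(60, 100))  # ~0.028 -- unlikely if they were equal

The number only confirms or refutes a judgement that was already formed by
watching the games.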

Very often the empirical data produced after a few months of playing gives
the same judgement as my first impression of the program. It is the same
with human beings. Of course there are a few human beings who impress you,
and after a while you find out they are liars and assholes who spent much of
their lives acting as if they were nice friends.
Chess or humans: it is exactly the same for me.
Believing in friends, believing in programs. Exactly the same.


> And taking a liking to the way a program plays
>is just a whole lot different from knowing if it can get results.
>
>Self testing is an important part of the way we test.  It is
>ideal for some things, not very good for others.  We test all
>sorts of different ways and watch our program play to learn
>what is wrong.   I don't think anyone has found the ideal method
>for accurately measuring tiny program improvements.

Right. It is all the methods together. But still, despite all the different
methods, I believe there is a non-empirical, emotional way of doing it that
also gives exact results. But it is difficult to find out WHY it works.

> If you
>give me a way to measure a 1 rating point improvement, I will
>write you a chess program that beats all the others with no
>question about who is strongest.
>
>Larry had a theory that self testing exaggerates improvements.
>If you make a small improvement, it will show up more with
>self testing.  If this is true (we called it a theory, not a
>fact) then this is a wonderful way to test because small
>improvements are so hard to measure and this exaggerates them.

It can also, IMO, lead you in a completely wrong direction: when the new
feature only works against program x and not against all the other programs
y, or against humans. You think you have an improvement, when in fact it is
only an improvement relative to x.
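
Don's remark above about measuring a 1-rating-point improvement can be put
into numbers. A back-of-the-envelope sketch under the standard Elo model;
the 2-sigma detection criterion and the worst-case per-game sigma of 0.5
are my assumptions, not anything Don stated (draws shrink sigma a little):

import math

def expected_score(elo_diff):
    # Expected score per game under the Elo model.
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def games_needed(elo_diff, z=1.96, sigma=0.5):
    # Games until the observed score edge exceeds z standard errors.
    # sigma = 0.5 is the worst case (all decisive games).
    edge = expected_score(elo_diff) - 0.5
    return math.ceil((z * sigma / edge) ** 2)

print(games_needed(1.0))   # ~460,000 games to resolve 1 Elo
print(games_needed(10.0))  # ~4,600 games to resolve 10 Elo

Nobody plays half a million games, which is why a test that exaggerates
small improvements, if Larry's theory holds, would be so valuable.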

>We know that self testing won't expose our program to weaknesses
>that it itself cannot exploit.  That's why we consider it as
>a supplement to other kinds of testing.

No problem with this point of view. My point was that self-testing alone
will not work, in the same way that test suites alone do not show you
anything. It is the whole package, together. And still it is complicated
enough, and takes enough time.

>- Don


