Computer Chess Club Archives

Subject: Re: Knee jerk reaction!

Author: Robert Hyatt

Date: 17:49:30 09/12/04

On September 12, 2004 at 18:04:53, Sune Fischer wrote:

>On September 12, 2004 at 17:39:26, Robert Hyatt wrote:
>
>>>I don't believe it's just one big fuzzy thing that can't be separated.
>>
>>I consider a "computer chess playing entity" to be just that.  The sum of _all_
>>the parts.  If you want to test individual parts, fine by me.  But what you are
>>testing has _nothing_ to do with how the complete "entity" plays chess.
>
>I won't go too much into this again, I think Sandro and I already have discussed
>this :)
>
>> It
>>won't predict how well it analyses.
>
>Why not?

Because you force it into positions it won't normally play, against programs
that also get forced into odd positions they won't normally play.  How will
making decisions about that tell you anything about which is the best for
analysis???

>
>You mean to say that giving it a very strong book or very weak book will make it
>_easier_ to see how strong it is in analysis?
>That makes absolutely no sense at all.

No, giving it _the_ book customized for it will tell you how well it can play.
Giving it an odd book most likely will weaken the thing.  I don't see how you
can use _games_ to predict how well a program can do in analysis...  The two are
not related directly.  I've known plenty of strong players that couldn't explain
a thing, and weaker players that could point out problems very clearly...

>
>>How well it will do against humans or
>>computers.  Or anything else...
>
>No but it will reveal its strong and weak points which might be of interest to
>the user.

What user is qualified to figure that out?  When the programs are so much
stronger than 99.9% of the humans that are trying to figure this out...

IE bozos can't really decide which brain surgeon is the best...

>
>>>
>>>>Why does it matter how Fritz does with a bad book?
>>>
>>>Suppose we gave the Fritz book to a strong amateur engine, Aristarch for
>>>instance and in a very long match it beat Fritz.
>>>
>>>That would obviously be of interest for several reasons.
>>
>>
>>No more so than if someone builds a new book from a PGN collection and produces
>>the same kind of result.  If the goal is to find the best book for program X,
>>then such a test would make sense.  But that is _not_ the goal that is being
>>discussed.
>
>You are the one who has been saying it makes _no_ sense to play without own
>books, I'm just trying to show you that there are reasons to play without them.

I haven't seen a single good example of how/why doing this provides useful
information.

>
>My job is easy, I just need one single counter example :)

No, you have it backward.  You are saying something is OK.  I have given more
than one counter-example of why it is _not_ ok...  You can't give one example of
why it _is_ ok and then conclude "it is ok."

>
>>It is taking hokey positions, making programs play them against each
>>other, and then trying to draw conclusions from that.  The two are _not_ the
>>same thing.
>>
>>Ditto for learning on/off, pondering on/off, etc...
>
>I disagree.

Then we just have to agree to disagree.  My experience leads me to one
conclusion.  Based on writing several programs, playing in all sorts of
competitions, etc...

>
>>>>No endgame tables?
>>>
>>>There is no room for endgame tables on his laptop.
>>
>>Baloney.  I have a sony VAIO with a 20 gig hard drive.  I have _all_ the 3-4-5
>>piece files on it...  20 gig drives are small today.
>
>I have a 10 GB drive and it is full.

You made that choice.  You _could_ get the 5 piece tables on there if you
_wanted_ them.  That is the point...  It isn't a matter of "can't".  It is a
matter of "don't want to".

>
>To take another example, how are you going to use endgame tables on the
>PocketPC?
>http://www.pocketgear.com/software_detail.asp?id=15142

In 5 years the answer will be obvious.. :)

>
>>>
>>>>Impossibly short time controls?
>>>
>>>He needs to analyse 50000 games.
>>
>>For what possible reason that makes any sense???
>
>Ask him, I won't be the judge of what people should and shouldn't do.
>
>>>> No pondering?
>>>
>>>He has a single CPU machine.
>>
>>
>>So?  I do ponder=on matches on my single-cpu laptop all the time.  No problems
>>at all
>
>How do you make sure they get 50% cpu each?

I don't.  I trust the O/S to do that.  I just watch something like "top" to be
sure it is correct most of the time.  If one chooses to not ponder for some
reason, oh well...

>
>What happens when one engine hits EGTB or runs a high priority thread?

You aren't going to run a "high-priority" thread on a real O/S, unless you are
running as a privileged user.  If so, that is so far beyond stupid as to not
need any explanation.  Easy way to lose the whole system, so it shouldn't be
done.  Of course you should not put your foot under a running lawn mower either.
You can, but you shouldn't.

>
>
>>>>  No learning?
>>>
>>>He wants reproducible results.
>>
>>He wants meaningless results you mean.
>
>I can believe the low regard you hold for reproducibility, it is just the
>foremost important property of any experiment.
>
>How do you measure progress without reproducibility?

If trying to find out which program is better, A or B, reproducibility is _not_
an issue.  Do you _really_ think that if you play me as a human, that I am going
to play the same moves every time you do?  Yet even in spite of that lack of
reproducibility, you can't tell whether you are better than I am?
Reproducibility is great for debugging.  Not necessary for strength
measurements.
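
To make the strength-measurement point concrete, here is a minimal sketch of how a match result between A and B is usually turned into a rating difference with an error bar.  It assumes the standard logistic Elo model and a normal approximation for the interval; the function names and the example counts are made up for illustration and do not come from any particular rating tool.  Nothing in it requires the individual games to be repeatable, only that there are enough of them.

```python
import math


def elo_from_score(score):
    """Convert a match score fraction to an Elo difference (logistic model).

    The score is clamped away from 0 and 1 so the logarithm stays finite.
    """
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for program A against program B.

    wins/draws/losses are counted from A's point of view.  The interval
    uses the per-game variance of the score and a normal approximation,
    which is an assumption that only holds for reasonably long matches.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


if __name__ == "__main__":
    # Hypothetical match result, not data from this thread.
    elo, low, high = match_elo(wins=60, draws=30, losses=10)
    print(f"A - B = {elo:+.0f} Elo (95% interval {low:+.0f} .. {high:+.0f})")
```

The interval shrinks as the number of games grows, which is why a long match between non-deterministic programs can still answer "is A better than B".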

>
>Say he wants to see how much changing the hash size means for Crafty - he can't
>conclude anything due to the learning.

So how are you going to get reproducible results with Crafty?  My book _always_
has a randomness element in it.  The _search_ has a random element since it is
based on processor timing info that can vary from one game to another by a few
fractions of a second each move, which can have an impact on moves chosen by the
search.

>
>Say he changes some evaluation parameters and wants to see if Crafty plays
>better - he can't conclude anything due to the learning.
>

Then he can't learn anything at all as there is no reproducibility in Crafty if
the book is used.  So use the best book with learning turned on to get the
_best_ non-reproducible result.

>etc...
>
>>Suppose one person hand-tunes their
>>book.  The other chooses to go the book-learning route instead.  This test is
>>therefore flawed in a most basic way.
>
>Not in testing analysis power.

You haven't given one practical idea for finding out which engine is best for
analysis.  I don't begin to buy "the engine that does the best on random
positions" because that is _not_ true for humans.  And it isn't true for
computers either.

>
>>>
>>>>Why not test with "no code" as well???
>>>
>>>He already knows how strong that would play.
>>
>>
>>Apparently not.
>
>You forgot a smiley :)
>
>>>
>>>>>Suppose the book is worth 100 Elo and Fritz is the only one who is allowed to
>>>>>use that book, now obviously Fritz will look 100 Elo stronger in all matches
>>>>>than it really is, and obviously these 100 Elo are worth nothing to a
>>>>>correspondence player who only needs the engine for analysis.
>>>>
>>>>Au contraire, Fritz will be giving _good_ opening advice, for one thing...
>>>
>>>I think the GUI+book will be doing that.
>>
>>So?  That is, by definition, "Fritz".
>
>Ehmm, so it is the same program competing several times in WCCC?

Don't know what you mean.  Fritz is Fritz.  Or Quest.  Or whatever.  I've
already gone on record as saying one Fritz GUI should participate since it does
opening book selection, handles table positions at the root, etc.  But that's
another subject...

>
>>>
>>>>And if you expect _any_ program to give good advice on oddball openings, good
>>>>luck...
>>>
>>>I expect a program to do the best it can, even in objectively lost situations.
>>>There is honour in fighting for a draw as well :)
>>>
>>>-S.
>>
>>Certainly, but I don't plan on testing in every possible kind of position.  I
>>just avoid the ones that don't look particularly reasonable and leave it at
>>that.  It works...
>
>It works for playing a quick tournament.
>
>For the long run development and to be strong in general analysis I think it is
>interesting to investigate and improve the weak points also.
>
>-S.

I wouldn't disagree.  But sometimes a strong point and a weak point are
orthogonal to each other.  You can't do both well.  So you pick one to do well,
and avoid the other.
