Computer Chess Club Archives


Subject: Re: Next Human vs Computer ratings list - I need opinions

Author: Robert Hyatt

Date: 15:18:44 05/19/00


On May 19, 2000 at 14:58:01, blass uri wrote:

>On May 19, 2000 at 13:26:26, Robert Hyatt wrote:
>
>>On May 19, 2000 at 12:41:57, Enrique Irazoqui wrote:
>>
>>>On May 19, 2000 at 12:22:05, Ed Schröder wrote:
>>>
>>>>On May 19, 2000 at 11:05:45, Enrique Irazoqui wrote:
>>>>
>>>>>On May 19, 2000 at 10:58:57, Robert Hyatt wrote:
>>>>>
>>>>>>On May 19, 2000 at 10:27:04, blass uri wrote:
>>>>>>
>>>>>>>On May 19, 2000 at 09:42:07, Enrique Irazoqui wrote:
>>>>>>>
>>>>>>>>On May 19, 2000 at 09:37:19, Chris Carson wrote:
>>>>>>>>
>>>>>>>>>I am planning to publish an updated list here with
>>>>>>>>>all rated human vs computer results for 40/2 events.
>>>>>>>>>
>>>>>>>>>Please let me know your thoughts on the following:
>>>>>>>>>
>>>>>>>>>1.  Exclude Performance Rating when 3 or fewer games
>>>>>>>>>    have been played by a program/hardware.
>>>>>>>>
>>>>>>>>I don't see why.
>>>>>>>>
>>>>>>>>>2.  Exclude forfeits and protest resignations (Dutch Championship),
>>>>>>>>>    and games where computers lost due to hardware, IP failures,
>>>>>>>>>    or operator error.
>>>>>>>>
>>>>>>>>I would definitely exclude forfeits and IP failures, but not the rest. In my
>>>>>>>>opinion, this list is interesting if it reflects the real performance of
>>>>>>>>programs in actual games. Hardware failures and operator's errors are part of
>>>>>>>>how a program plays. Forfeits and IP failures are not.
>>>>>>>>
>>>>>>>>Enrique
>>>>>>>
>>>>>>>Do you really think that losing on time is part of how Shredder4 plays?
>>>>>>>
>>>>>>>I do not agree.
>>>>>>>I think that operator errors are not part of how a program plays, and it is not
>>>>>>>fair to include the game that Shredder lost on time in a winning position when
>>>>>>>the reason was not a bug in the program.
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>
>>>>>>Depends on your definition of "How Shredder plays".  If you mean how it plays
>>>>>>in human events, then the answer is "yes".  Because the operator _will_ make a
>>>>>>mistake here and there.  Resigning when there is a deep saving move that the
>>>>>>program might have played without understanding it.  Losing time on the clock
>>>>>>by going to the bathroom.  Etc. The human operator _is_ part of the "system"
>>>>>>until we start using robots controlled by the computer.
>>>>>>
>>>>>>I have made mistakes (as an operator) that ended up costing Cray Blitz a game
>>>>>>here and there.  In the WMCCC event in Jakarta, the operator misunderstood how
>>>>>>to set the time control and set it for 40 moves in 2 days, not 40 moves in 2
>>>>>>hours.  We lost the first game that way.  If you have a human in the loop, then
>>>>>>he has to be factored in.  As do hardware failures, which _do_ happen in games.
>>>>>>
>>>>>>In fact, bleeding edge hardware is dangerous to use for this reason.
>>>>>
>>>>>This was my first reaction too, but I remember reading here that the operator of
>>>>>Shredder in the last round of the Israeli league lost on time almost on purpose,
>>>>>making telephone calls, not caring about the program, etc. So it is an
>>>>>exceptional case that in my opinion makes the game irrelevant for rating
>>>>>purposes.
>>>>
>>>>I understand the point you are making. The very same thing happened in
>>>>the 2 games Rebel8 played against GM Ralf Akesson. Rebel8 won the first
>>>>game and lost the second game on time due to an operator error in a
>>>>promising position. Make an exception? No way IMO. The next thing a GM
>>>>loses on time in a won position because his wife gave birth and he went
>>>>home. The list of exceptions soon becomes endless. We need a clear rule.
>>>
>>>Sure, but if the purpose of this rating list is to give us an idea of the
>>>strength of programs, I would discard games that we know are meaningless, like
>>>the 2 forfeits of Fritz in Holland and this Shredder game. The key word, to me,
>>>is "meaning", and this game has none. The list may be more complicated, but also
>>>more accurate.
>>>
>>>Enrique
>>>
>>>>Ed
>>>>
>>>>
>>>>>Enrique
>>
>>
>>Forfeits are "non-events" since they weren't played (even the 4 mover was a
>>non-game for obvious reasons).  But operator errors are part of the computer
>>"system".  And you can't always have a "perfect" operator.  The best solution
>>is that the author is the _only_ one that operates.
>
>I do not agree.
>Part of the operator's job is deciding whether to accept a draw and whether
>to offer one.
>
>I do not think that the decisions of the author are always better.
>
>Uri


I am not talking about _decisions_.  I would instruct _any_  operator to do
the following:

(a) if the opponent offers a draw, type "draw" and see if Crafty accepts or
declines.  It has the final say.

(b) do not resign or offer a draw unless instructed to do so by Crafty.  It
will offer a draw ("I offer a draw") or it will resign ("Crafty resigns")
all by itself.

So for Crafty, there are _no_ human decisions to make.  But there _are_ things
that can go wrong.  The wrong time control gets set.  The wrong move gets
entered, so the game has to be backed up and Crafty is penalized for the lost
time.  Etc.

I believe all of those are part of the "game" since they are going to happen
in a certain percentage of games, no matter what...
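
To make the rule concrete, here is a rough sketch of how a list keeper might apply it.  Everything in it is an assumption for illustration only: the `Game` record, the `cause` labels, and the `min_games` cutoff (taken from point 1 of the proposal above) are hypothetical, and the performance figure uses the common linear approximation (average opponent rating plus 400 times wins minus losses, divided by games), not whatever formula the actual list uses.

```python
from dataclasses import dataclass


@dataclass
class Game:
    opponent_rating: int  # rating of the human opponent
    score: float          # 1.0 win, 0.5 draw, 0.0 loss (program's side)
    cause: str            # hypothetical label, e.g. "normal", "forfeit",
                          # "operator_error", "hardware_failure"


# Assumption: only games that were never really played are dropped.
# Operator errors and hardware failures stay in, as argued above.
EXCLUDED_CAUSES = {"forfeit"}


def rated_games(games):
    """Return the games that count toward the rating list."""
    return [g for g in games if g.cause not in EXCLUDED_CAUSES]


def performance_rating(games, min_games=4):
    """Linear-approximation performance rating, or None if too few games."""
    counted = rated_games(games)
    n = len(counted)
    if n < min_games:
        return None  # 3 or fewer games: no performance rating reported
    avg_opponent = sum(g.opponent_rating for g in counted) / n
    total_score = sum(g.score for g in counted)
    # wins - losses equals 2 * total_score - n (draws cancel out)
    return avg_opponent + 400 * (2 * total_score - n) / n
```

With this filter a loss on time caused by the operator still counts as a loss, while a forfeit simply disappears from the game count.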




Last modified: Thu, 15 Apr 21 08:11:13 -0700
