Computer Chess Club Archives


Subject: Re: Next Human vs Computer ratings list - I need opinions

Author: Dave Gomboc

Date: 12:50:46 05/19/00


On May 19, 2000 at 13:29:05, Ed Schröder wrote:

>On May 19, 2000 at 13:26:26, Robert Hyatt wrote:
>
>>On May 19, 2000 at 12:41:57, Enrique Irazoqui wrote:
>>
>>>On May 19, 2000 at 12:22:05, Ed Schröder wrote:
>>>
>>>>On May 19, 2000 at 11:05:45, Enrique Irazoqui wrote:
>>>>
>>>>>On May 19, 2000 at 10:58:57, Robert Hyatt wrote:
>>>>>
>>>>>>On May 19, 2000 at 10:27:04, blass uri wrote:
>>>>>>
>>>>>>>On May 19, 2000 at 09:42:07, Enrique Irazoqui wrote:
>>>>>>>
>>>>>>>>On May 19, 2000 at 09:37:19, Chris Carson wrote:
>>>>>>>>
>>>>>>>>>I am planning to publish an updated list here with
>>>>>>>>>all rated human vs computer results for 40/2 events.
>>>>>>>>>
>>>>>>>>>Please let me know your thoughts on the following:
>>>>>>>>>
>>>>>>>>>1.  Exclude Performance Rating when 3 or fewer games
>>>>>>>>>    have been played by a program/hardware.
>>>>>>>>
>>>>>>>>I don't see why.
>>>>>>>>
>>>>>>>>>2.  Exclude forfeits and protest resignations (Dutch Championship),
>>>>>>>>>    and games where computers lost due to hardware, IP failures,
>>>>>>>>>    or operator error.
>>>>>>>>
>>>>>>>>I would definitely exclude forfeits and IP failures, but not the rest. In my
>>>>>>>>opinion, this list is interesting if it reflects the real performance of
>>>>>>>>programs in actual games. Hardware failures and operator's errors are part of
>>>>>>>>how a program plays. Forfeits and IP failures are not.
>>>>>>>>
>>>>>>>>Enrique
>>>>>>>
>>>>>>>Do you really think that losing on time is part of how Shredder4 plays?
>>>>>>>
>>>>>>>I do not agree.
>>>>>>>I think that operator errors are not part of how a program plays, and it is not
>>>>>>>fair to include the game that Shredder lost on time in a winning position when
>>>>>>>the reason was not a bug in the program.
>>>>>>>
>>>>>>>Uri
>>>>>>
>>>>>>
>>>>>>Depends on your definition of "How Shredder plays".  If you mean how it plays
>>>>>>in human events, then the answer is "yes".  Because the operator _will_ make a
>>>>>>mistake here and there.  Resigning when there is a deep saving move that the
>>>>>>program might have played without understanding it.  Losing time on the clock
>>>>>>by going to the bathroom.  Etc. The human operator _is_ part of the "system"
>>>>>>until we start using robots controlled by the computer.
>>>>>>
>>>>>>I have made mistakes (as an operator) that ended up costing Cray Blitz a game
>>>>>>here and there.  In the WMCCC event in Jakarta, the operator misunderstood how
>>>>>>to set the time control and set it for 40 moves in 2 days, not 40 moves in 2
>>>>>>hours.  We lost the first game that way.  If you have a human in the loop, then
>>>>>>he has to be factored in.  As do hardware failures, which _do_ happen in games.
>>>>>>
>>>>>>In fact, bleeding edge hardware is dangerous to use for this reason.
>>>>>
>>>>>This was my first reaction too, but I remember reading here that the operator of
>>>>>Shredder in the last round of the Israeli league lost on time almost on purpose,
>>>>>making telephone calls, not caring about the program, etc. So it is an
>>>>>exceptional case that in my opinion makes the game irrelevant for rating
>>>>>purposes.
>>>>
>>>>I understand the point you are making. The very same thing happened in
>>>>the 2 games Rebel8 played against GM Ralf Akesson. Rebel8 won the first
>>>>game and lost the second game on time due to an operator error in a
>>>>promising position. Make an exception? No way IMO. Next thing, a GM
>>>>loses on time in a won position because his wife gave birth and he went
>>>>home. The list of exceptions soon becomes endless. We need a clear rule.
>>>
>>>Sure, but if the purpose of this rating list is to give us an idea of the
>>>strength of programs, I would discard games that we know are meaningless, like
>>>the 2 forfeits of Fritz in Holland and this Shredder game. The key word, to me,
>>>is "meaning", and this game has none. The list may be more complicated, but also
>>>more accurate.
>>>
>>>Enrique
>>>
>>>>Ed
>>>>
>>>>
>>>>>Enrique
>>
>>
>>Forfeits are "non-events" since they weren't played (even the 4 mover was a
>>non-game for obvious reasons).  But operator errors are part of the computer
>>"system".  And you can't always have a "perfect" operator.  The best solution
>>is that the author is the _only_ one that operates.  As he is the least likely
>>to make an error that influences the game outcome.  But as soon as you use other
>>operators, the probability of error increases.  And as it increases, the chances
>>that the program will perform somewhat below "expectation" become greater.
>>
>>But that is part of the "system" IMHO.  Otherwise you start with "It lost that
>>game due to a power failure that lost some pondering time."  "It lost this game
>>due to a hardware glitch that made me reboot and restart, losing information."
>>"It lost that game due to an operator typo that made it have to back up and lose
>>stuff."  "It lost that game because the hardware crashed and wouldn't come back
>>up."
>>
>>Etc.
>>
>>All of those are part of computer chess.  The human has his own set of problems
>>to contend with, and he can't escape them either.
>
>Totally in agreement.
>
>Ed

I am not. For example, Bosboom's resignation is an indicator of resistance to forcing
humans to play computers in chess events.  Factoring those results into the
rating of the computer is more accurate than ignoring the case as an exception.
If the occurrence is truly an exception, the difference from rating that game
will become negligible as more games are played.  However, if it becomes
standard practice that humans cannot choose to avoid playing computers without
penalty, and protest forfeits such as Bosboom's become a regular feature of
tournaments, then the probabilistic outcome calculated from the ratings of the
human and computer will be more accurate if the possibility that a human will
resign against the computer before entering into a serious contest is taken into
account.
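
To put rough numbers on the "negligible as more games are played" point, here is a
minimal sketch. It assumes the common linear performance-rating approximation
(average opponent rating plus 400 times wins-minus-losses divided by games played),
which is not necessarily the formula the list will use. The function names, the
2500 opponent rating, and the 50% score are made up for illustration and are not
results from any event.

```python
def performance_rating(opponent_ratings, score):
    """Linear approximation: avg opponent rating + 400 * (wins - losses) / games.

    `score` is total points (win = 1, draw = 0.5, loss = 0), so
    wins - losses equals 2 * score - games.
    """
    games = len(opponent_ratings)
    average = sum(opponent_ratings) / games
    wins_minus_losses = 2 * score - games
    return average + 400 * wins_minus_losses / games


def shift_from_one_extra_loss(games_played, opponent_rating=2500.0):
    """Change in performance rating when one exceptional loss (e.g. a protest
    resignation) is added to `games_played` games scored at 50%.
    The 2500 opponent rating is a hypothetical value."""
    even_score = games_played / 2
    without = performance_rating([opponent_rating] * games_played, even_score)
    with_loss = performance_rating(
        [opponent_rating] * (games_played + 1), even_score
    )
    return with_loss - without


if __name__ == "__main__":
    for n in (4, 9, 99):
        print(n + 1, "games:", round(shift_from_one_extra_loss(n), 1))
```

Under these assumptions the single extra loss moves the performance rating by
400 / (games + 1) points, so the effect shrinks steadily as the sample grows.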

Dave




Last modified: Thu, 15 Apr 21 08:11:13 -0700
