Author: Howard Exner
Date: 09:52:36 10/03/99
On October 03, 1999 at 12:25:09, James B. Shearer wrote:

>On October 03, 1999 at 11:52:43, Howard Exner wrote:
>
>>On October 03, 1999 at 09:17:38, Georg v. Zimmermann wrote:
>>
>>>The game against Hoffmann should _of course_ be counted. Say what happens if I
>>>play a game with a cold in a tournament?
>>
>>
>>This is not quite the analogy that comes to mind. A computer that shorts out, or
>>has a power failure is more like a person having a total stroke or blackout
>>during the game - or maybe having someone bonk you over the head causing an
>>unconscious state.
>>
>>Of course if you are concerned about the game score then of course even
>>someone dying at the chess table will not matter. But in the GM challenge
>>the point is seeing how a computer program plays vs a human. Otherwise we may
>>find ourselves with a rash of posts, "I beat Crafty in 10 moves!" When asked
>>by the enquiring minds here on CCC, "How did you do that?", you could
>>simply reply, "The power went out in my house, it refused to move so it
>>lost on time! Yippee my rating just shot up!"
>
> Rebel did not lose on time.
> Obviously the game should count. In any scientific experiment,
>arbitrarily throwing out data points is forbidden because it can easily
>introduce biases that destroy the validity of the results.
>Any points thrown
>out should be on the basis of a protocol established before the experiment
>starts.

Protocol is important in gathering data. What is the protocol for the GM challenge? What is being measured? If what is being measured is the game result - ones and zeros - then the game is binding, as it falls within the protocol. If what is being tested is a program running on healthy hardware, then the protocol falls short. Isn't the spirit of the challenge the moves of the game? What is the GM trying to exploit? How is the computer handling this strategy? Is the game a lopsided crush? Is the GM working hard to press for victory?
All the drama of the game seems to me the point of the GM challenge, and a game seems void if the hardware is sputtering.

Take for example the SSDF protocol. When it comes to their attention that faulty settings were used (book, selectivity, default), they are quick to toss the games out as invalid. Other comp-comp testers do the same. Where is the line drawn for the Rebel - Hoffmann game? Numerous reboots of the hardware OK? A complete power outage OK? The human suffering a stroke OK?

I do respect the opinions of those who want to count the game as a valid measure of Rebel's strength vs a Grandmaster. Consensus is not important. Once a larger number of games has been played, the point spread between the two ratings (counting Hoffmann or not) will diminish anyway.

>Experience has shown humans are generally incapable of making objective
>decisions about this sort of thing. That is why double blind experiments were
>invented.
> James B. Shearer
Last modified: Thu, 15 Apr 21 08:11:13 -0700