Computer Chess Club Archives


Subject: Re: Comet A.96 - Wcrafty15.20 20 games blitz match

Author: Robert Hyatt

Date: 05:22:24 10/22/98



On October 22, 1998 at 03:03:19, Dave Gomboc wrote:

>On October 21, 1998 at 17:29:00, blass uri wrote:
>
>>
>>On October 21, 1998 at 16:29:01, Robert Hyatt wrote:
>>
>>>On October 21, 1998 at 09:18:57, blass uri wrote:
>>>
>>>>
>>>>On October 21, 1998 at 08:07:53, Robert Hyatt wrote:
>>>>
>>>>>On October 21, 1998 at 04:37:38, Nouveau wrote:
>>>>>
>>>>>>
>>>>>>On October 20, 1998 at 12:13:16, Dann Corbit wrote:
>>>>>>
>>>>>>>On October 20, 1998 at 10:37:36, Nouveau wrote:
>>>>>>>>On October 20, 1998 at 01:36:22, Jouni Uski wrote:
>>>>>>>>
>>>>>>>>>Here's result for 20 games match with 60/5 time limit (under Winboard):
>>>>>>>>>
>>>>>>>>>Comet    0.5 0 1 0 0 0 1 1 0 1 0 0 0.5 0 0.5 1 0.5 0 1 0   = 8
>>>>>>>>>Wcrafty  0.5 1 0 1 1 1 0 0 1 0 1 1 0.5 1 0.5 0 0.5 1 0 1   = 12
>>>>>>>>>
>>>>>>>>>So they are very close to each other in playing strength.
>>>>>>>>>
>>>>>>>>>Jouni
>>>>>>>>
>>>>>>>>12-8 is very close ??????????
>>>>>>>>
>>>>>>>>When can we say : Crafty is better than Comet ? 18-2 ?
>>>>>>>>
>>>>>>>>I don't understand this statistical stuff : I can't imagine a 12-8 result in a
>>>>>>>>match between 2 GM with a conclusion like "They are very close in playing
>>>>>>>>strength".
>>>>>>>>
>>>>>>>>Why do we need hundreds, maybe thousands of games between computers to evaluate
>>>>>>>>relative strength, when few dozens are more than needed for human GMs ?
>>>>>>>Any strong conclusion from a single match is faulty.  It could be that Comet is
>>>>>>>500 points above Crafty, or 500 points below (although both of these are
>>>>>>>statistically very unlikely, really, very little has been demonstrated at this
>>>>>>>point from a single set of games).
>>>>>>
>>>>>>Just imagine : the match between Kasparov and Chirov takes place and the result
>>>>>>is : Kasparov-Chirov = 12-8.
>>>>>>Maybe Kasparov is 500 points above Chirov or 500 points below...Show me any
>>>>>>chess magazine that would print such an affirmation.
>>>>>>I know, those chess journalists don't have a clue on science and stats ;o)
>>>>>>
>>>>>>> The international chess bodies like FIDE
>>>>>>>have definitely got it right in the way that they perform evaluations using the
>>>>>>>ELO method.  Also, in requiring a long period of excellent results to become a
>>>>>>>GM.
>>>>>>
>>>>>>Can someone make the math for this : a player has a 2600 level but no rating,
>>>>>>how many games against a 2500 opposition does he need to reach 2600 ?
>>>>>>
>>>>>
>>>>>
>>>>>easy here.  one game.  his rating would be 2700 after that one game, since
>>>>>the first N games uses the usual "TPR" type calculation.
>>>>
>>>>after 1 game you have no rating
>>>>you need at least 9 games to have a rating (not important if it is 2005 or 2700)
>>>>
>>>>players who have 2005 rating need at least 30 games if you assume they cannot
>>>>earn more than 20 elo in one game.
>>>>
>>>>Uri
>>>
>>>
>>>No idea about FIDE rules about ratings, but the USCF publishes ratings after a
>>>single tournament.  I have known players with ratings like this:  2244/4, which
>>>means provisional with only 4 games played so far.  And during the provisional
>>>period, the rating can fluctuate dramatically because the formula is simply
>>>the sum of the ratings of the opponents you beat (+400 for each one) plus the
>>>sum of the ratings of the opponents that beat you (-400 for each one) plus the
>>>sum of the ratings of the opponents that you drew, divided by the total games
>>>counted.  I.e., performance rating.  And in that light, beating a GM gives you his
>>>rating+400 after one game...
>>>
>>>And I assume most use K=32 nowadays (at least the ratings I have seen do this)
>>>which means you can go up/down up to 32 points in one game...
>>
>>I remember that in fide rules K is not constant and it is bigger for players
>>that did not play many games.
>>
>>I do not remember the exact rules but I do not know about K=32 in fide rules and
>>I think that K is smaller for everyone.
>>
>>I know that I needed at least 9 games against rated players to get a fide
>>rating.
>>
>>I needed 2 tournaments for a fide rating because in the first tournament I had
>>not 9 games against fide rated players.
>>
>>Uri
>
>USCF minimum might be 3 or 4 games.  CFC minimum is 4 games before they'll
>publish a provisional rating.  FIDE wants 9 games.
>
>If I recall correctly, FIDE's K factor was 12 back when it was constant.  This
>is a far cry from the USCF's 32.  (The CFC used 32 below 2300, then 16 above
>2300.  Recently they diddled with their formulas; I didn't pay much attention
>to exactly how, but they did move the cutoff to 2200.)
>
>At any rate, all interesting modern hardware and software combinations are
>strong enough that if you were going to treat them as players that could vary in
>skill over time, a K factor of at most 12 would be appropriate.  Since the
>variability of computer play is so low relative to the variability of human
>play, perhaps a K of 4 (hand-wave) would be more realistic.
>
>Dave Gomboc
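
The provisional formula quoted above (opponent's rating +400 for a win, -400
for a loss, unchanged for a draw, averaged over the games counted) can be
sketched as follows. This is only an illustration of the formula as described
in this thread, not any federation's official procedure; the function name and
the (rating, score) input format are assumptions made for the example.

```python
def provisional_rating(results):
    """Average of per-game performance values.

    results: list of (opponent_rating, score) pairs, where score is
    1.0 for a win, 0.5 for a draw and 0.0 for a loss.
    """
    if not results:
        raise ValueError("need at least one game")
    total = 0.0
    for opponent_rating, score in results:
        if score == 1.0:
            total += opponent_rating + 400   # beat this opponent
        elif score == 0.0:
            total += opponent_rating - 400   # lost to this opponent
        elif score == 0.5:
            total += opponent_rating         # drew with this opponent
        else:
            raise ValueError("score must be 0.0, 0.5 or 1.0")
    return total / len(results)


if __name__ == "__main__":
    # One win over a 2500 player, as in the "beating a GM" remark.
    print(provisional_rating([(2500, 1.0)]))
    # A win, a loss and a draw against three different opponents.
    print(provisional_rating([(2400, 1.0), (2500, 0.0), (2450, 0.5)]))
```

With a single game counted, the result is simply the opponent's rating plus or
minus 400, which is why a provisional rating can swing so much early on.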


Actually, I would change K differently.  It was originally set at 32
because this gave reasonable rating changes when people might play one
rated event a month, at most.  Not bad...

Now, particularly on servers, people can play 10-20 rated games in an
evening... and a K=4 or whatever would definitely be an improvement, since
your rating won't fluctuate nearly so wildly, and it shouldn't when you
think about it.
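
To make the effect of K concrete, here is a minimal sketch using the usual Elo
expected-score formula with a 400-point logistic scale. The function names are
made up for the example, and real servers or federations may add floors,
bonuses or provisional rules that are not modelled here.

```python
def expected_score(rating, opponent_rating):
    """Expected score (0..1) under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating, opponent_rating, score, k):
    """New rating after one game; score is 1.0, 0.5 or 0.0."""
    return rating + k * (score - expected_score(rating, opponent_rating))


def rating_after_games(rating, opponent_rating, scores, k):
    """Apply the update game by game against a fixed-strength opponent."""
    for score in scores:
        rating = updated_rating(rating, opponent_rating, score, k)
    return rating


if __name__ == "__main__":
    evening = [1.0] * 5          # five straight wins in one evening
    for k in (32, 4):
        final = rating_after_games(2000.0, 2000.0, evening, k)
        print(f"K={k}: 2000 -> {final:.1f}")
```

A single game can never move the rating by more than K points, so the same
lucky evening produces a much smaller swing with K=4 than with K=32.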

I.e., I think K should be set (like the Glicko system) based on the
number/frequency of games...
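
One way to picture a K that depends on activity is the sketch below. It is
purely hypothetical: the function, its parameter names and the default numbers
are invented for illustration. It is also not the Glicko formula, which tracks
a per-player rating deviation rather than adjusting K directly.

```python
def k_for_activity(games_in_window, k_max=32.0, k_min=4.0, typical_games=4):
    """Hypothetical K schedule: K shrinks as more games are played.

    games_in_window: rated games played in the recent window (e.g. a month).
    typical_games:   game count at or below which the full k_max applies.
    """
    if games_in_window < 0:
        raise ValueError("games_in_window must be non-negative")
    # Scale K down in proportion to how far activity exceeds the typical count.
    scale = typical_games / max(games_in_window, typical_games)
    return max(k_min, k_max * scale)


if __name__ == "__main__":
    for games in (1, 4, 16, 100):
        print(games, k_for_activity(games))
```

Under these made-up defaults, a player with one event a month keeps K=32,
while someone playing dozens of server games ends up at the K=4 floor.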




Last modified: Thu, 15 Apr 21 08:11:13 -0700
