Computer Chess Club Archives


Subject: The data isn't really very skewed

Author: KarinsDad

Date: 21:59:25 02/28/99


On February 28, 1999 at 14:08:51, Mark Young wrote:

>On February 28, 1999 at 12:05:46, KarinsDad wrote:
>
[snip]
>>Why does this matter? In the USCF, you have GMs who will only play in Open
>>sections, GMs who will only play in Closed sections, and GMs who will play in
>>any section. Do you make the claim that their USCF rating should be different
>>based on what type of (equal time, but different types of opponent) tournament
>>that they used to get their rating? No. You give them their due regardless of
>>what type of opponents they used to acquire their rating. The same should apply
>>here.
>>
[snip]
>
>It matters because it skews the ratings. Being able to pick whom you will and
>will not play, no matter how it is done, skews the ratings.

The skewing will not really be significant. Let's take some examples:

1) A GM only plays one other GM on the chess server. This is slightly skewed as
the first player will be rated based solely on the ability of the second.

2) A GM only plays one computer program on the chess server. This is the same as
number 1.

3) A GM only plays GMs on the chess server. This is not skewed as the first
player will be rated based on the ability of many players.

4) A GM only plays multiple computer programs on the chess server. This is the
same as number 3.

5) A GM only plays multiple computer programs and multiple humans on the chess
server that he can beat (as per your earlier post Mark). This would be skewed,
but I doubt it happens often. Since GMs play chess for a living, the tendency
for them would be to play challenging opponents in order to stay in fighting
trim.

6) A GM plays only one or a few other GMs and only one or a few other computer
programs. This is only slightly skewed since the GM is playing multiple
opponents.

7) A GM plays whomever he can. This is not skewed at all.

8 through 14) Replace "A GM" with "A computer program". #12 becomes a program
that only plays lower rated programs and humans. Again, why? To boost the ego of
the program writer?

So, this covers a good majority of the cases. Only #1, #2, #5, #6, #8, #9, #12, and
#13 are skewed at all and then only slightly (with the exceptions of #5 and #12
which should happen infrequently). Overall, with a large number of games, the
ratings should be reasonably valid. As with any set of data and statistics, you
can either use what you have and make reasonable approximations, or you can
attempt to invalidate it since it is not in a totally "controlled" environment.
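
To put rough numbers on cases #1 and #3, here is a minimal sketch. It assumes a standard Elo-style expected-score formula (the servers may well use something different), and every rating in it is made up for illustration. The function names are hypothetical. The idea: a player's rating settles where the score the rating system predicts matches the score he actually earns, so any error in his opponents' listed ratings leaks into his own rating.

```python
def expected_score(rating_a, rating_b):
    """Elo-style expected score for A against B (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def equilibrium_rating(true_strength, opponents, lo=0.0, hi=4000.0):
    """Rating at which predicted score equals actual expected score.

    opponents: list of (true_strength, listed_rating) pairs, each
    assumed to be played equally often. All values are hypothetical.
    """
    # Score the player really earns, driven by true strengths.
    actual = sum(expected_score(true_strength, s) for s, _ in opponents)
    # Bisection: the predicted score rises as the candidate rating rises.
    for _ in range(60):
        mid = (lo + hi) / 2.0
        predicted = sum(expected_score(mid, r) for _, r in opponents)
        if predicted < actual:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Case #1: one opponent who is listed 50 points above his real strength.
single_opponent = equilibrium_rating(2600, [(2600, 2650)])

# Case #3: several opponents whose rating errors roughly cancel out.
many_opponents = equilibrium_rating(
    2600, [(2600, 2650), (2600, 2550), (2600, 2620), (2600, 2580)]
)

print(round(single_opponent), round(many_opponents))
```

Under these assumptions, the player with a single opponent inherits that opponent's entire 50 point error, while the player with several opponents ends up back at his real strength because the errors average out. That is the sense in which #1 is "slightly skewed" and #3 is not.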

> I don’t mean a player
>deciding to play only in the closed section of a tournament. An example would be
>GM Kasparov deciding never to play GM Anand, because he knows this is the only
>player that can beat him, so even in the same tournament they don’t play,
>thereby keeping his rating higher and GM Anand's rating lower.

Taking the extreme case (at the end of the spectrum) and using it to judge what
may or may not be happening in a high percentage of cases on the chess servers
isn't quite valid. For one thing, there
are several GMs playing the computer chess programs on the chess servers. These
same GMs also play against other GMs and other lower rated players. Although
some of the GMs are avoiding the programs, that does not invalidate their
ratings or their ability or the ratings or abilities of the GMs who do not avoid
the programs. Also, you could limit your data collection to those GMs who do not
avoid the chess programs if you wanted to make it less "skewed" or more
"controlled" (as was done in the original post).

KarinsDad

>
>Mark
>
>
[snip]


