Computer Chess Club Archives



Subject: Re: Computer accounts DO cause ratings inflation

Author: Robert Hyatt

Date: 01:13:12 03/18/99



On March 18, 1999 at 00:06:12, Mark Young wrote:

>On March 17, 1999 at 22:18:32, Robert Hyatt wrote:
>
>>On March 17, 1999 at 14:14:05, Mark Young wrote:
>>
>>>On March 17, 1999 at 13:31:57, Robert Hyatt wrote:
>>>
>>>>On March 17, 1999 at 08:56:49, Albert Silver wrote:
>>>>
>>>>>This whole story about Mark's account screwing up the ratings on Chess4u has
>>>>>been somewhat interesting. No doubt a few will disagree. The reason is that NO
>>>>>ONE except for Hyatt, though for different reasons, actually gave any credence
>>>>>to this. Chess4u is right, but not about Mark. The accounts that inevitably
>>>>>cause inflation are the ones that use more than one program or accounts where a
>>>>>lot of testing is done. Suppose I have, as Mark did, Hiarcs 7 running on a
>>>>>PII-450 and it gets an official rating of 2800. No problem as it is indeed
>>>>>playing at that level and its results correspond accordingly. Now suppose after
>>>>>about 2 months, I see the latest version of GNU chess out. The author claims it
>>>>>is vastly improved and should be playing much better, though no one knows just
>>>>>how much. I decide to test it with my account. GNU chess is not a 2800 player,
>>>>>but when testing starts it is playing with a 2800 rating. It gets trounced by
>>>>>the super opposition and the rating drops until it stabilizes at around 2300. I
>>>>>am not personally worried as after the testing is done, H7 will obviously regain
>>>>>its lost points. The problem is that 500 points were spread out in the pool and
>>>>>they don't properly represent an increase in strength on the opponents' part.
>>>>>When I get back, I don't go to 2800, but a bit higher as I am now playing the
>>>>>same opponents, but with slightly higher ratings. If a program undergoes
>>>>>testing, and experiences severe rating fluctuations while it is being tested,
>>>>>then the same phenomenon takes place. Bob is obviously already aware of this as
>>>>>his notes to his Crafty account on ICC state that opponents who clearly play him
>>>>>ONLY when Crafty's rating is high but never when it is at a low, will be
>>>>>'noplayed'.
>>>>>
>>>>>                                   Albert Silver
>>>>
>>>>
>>>>This is a problem that the 'operators' often don't consider.  I.e., it is _really_
>>>>unfair to have a 2300 rating with a 2800 program.  The other case is bad in that
>>>>it is going to skew ratings, but this case is _really_ bad because anyone that
>>>>plays that 2300 player will likely get crushed at a rate comparable to what
>>>>would happen with a 2800 opponent.  And that causes some gross hard feelings.
>>>>
>>>>This was the point I was trying to make with Mark...
>>>
>>>I understood your point, but it was not to the point in my case with Chess4You.
>>>I only used 1 program, I only played 11 games, and I played the strongest
>>>players in the ratings pool, and more than 1 player.
>>>
>>>Mark Young
>>>
>>>
>>
>>
>>good, because I intended no 'put-down' at all.  But we are at a new 'era' where
>>almost all computer programs can blow off GM players at blitz, many can blow
>>them off at action, and it won't be all that long before we blow them off at
>>40/2hr.
>
>I agree, I just did not understand why you brought this up in defense of the
>admins at Chess4You. I was not playing humans at all, but the two strongest
>computer programs on their server. Your comment seemed pointless to that post,
>but I do agree with most of it.
>
>I may not have been clear that I was playing strong programs. I was in no way
>distorting the ratings of human players, as I was provisional and did not change
>any program's or human's rating. At this point the server was just trying to
>find what Hiarcs 7's rating was.
>


Never having been there, and not planning on going there, I don't know the
admins, and I didn't intend to 'support' them, as I don't know enough.  I
was explaining _how_ ratings can be abused.  And every time "we" (we ==
computers) do something like that, it is one more black mark against "us" (us
== the collective set of computers on a server).  Just as with USCF/FIDE, we
only get so many boo-boos before our collective time is 'up'.



>
>
>>
>>In 1975 the only people fighting computers were the 14-1500 players, because
>>everyone else could beat them.  Then by 1980 it was up to the expert ranks.
>>In 1981 we had Belle and Cray Blitz, and now the masters were getting thumped
>>and joined the bandwagon.   It is only a matter of time before the GMs say
>>'enough' and _that_ will definitely be _that_ I am afraid...
>>
>>>>We are in a _very_ _fragile_ state right now.  Computers are already
>>>>effectively banned from normal tournaments.  It won't take a lot before they
>>>>are banned from servers.  I think we have to be _very_ cautious or we are
>>>>going to lose what has been the most remarkable development environment I
>>>>have seen in 30 years of doing this.
>>>>
>>>>I think that if someone told me "Hey, don't match and kill low-rated programs"
>>>>that I would simply "not match them, as asked."  (I don't match them anyway so
>>>>this is actually moot).  But there are times to fight back, and times to turn
>>>>the other cheek.  In light of the 'mood' concerning computers playing chess
>>>>today, I think 'caution' is required.  Because once the servers start saying
>>>>"OK, we've had enough of this rating manipulation stuff, enough complaints from
>>>>titled players getting challenged by computers, enough of all of this, so say
>>>>good-bye, computers, and get off this server."  And anybody that doesn't think
>>>>that can/will happen is poorly informed and ought to look over the delegate's
>>>>meeting discussions in old CL&R's and so forth.  I was _there_ for a couple,
>>>>and in 1984 it was pretty obvious to me where computers were headed: _out_.
>>>>And out we went.
>>>>
>>>>I cause some problems with Crafty, because my rating can fluctuate from 2700 to
>>>>nearly 3100.  And that is a wide swing.  I try to avoid putting 'garbage
>>>>versions' on ICC/FICS/etc, but I do make mistakes.  Or hardware problems will
>>>>kill it.  And that definitely causes problems.  Fortunately, since Crafty is
>>>>100% 'passive' and _never_ matches anyone unless they specifically ask me to
>>>>do so, it doesn't generate complaints.  If you stick your hand in a blender,
>>>>you really can't blame the blender manufacturer for what happens.  :)
>>>>
>>>>However, there have been _many_ manual operators that have been 'banned' from
>>>>servers like ICC for various forms of 'abuse'.  I only hope we don't all get
>>>>'class-banned' to avoid the headaches.
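
The mechanism Albert Silver describes in the quoted message can be sketched
with the standard Elo update.  This is only a minimal sketch, not the rating
code of Chess4u, ICC, or any other server: the K-factor of 32, the pool of ten
opponents at 2700, and the use of expected scores in place of real game
results are all assumptions made for illustration.  It shows an account rated
2800 that switches to a 2300-strength engine, and where the lost points go.
It also puts a number on the opposite case of a 2800 program carrying a 2300
rating.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def play_series(account_rating, account_strength, pool, k=32.0, rounds=200):
    """Play `rounds` passes against every opponent in `pool`.

    `pool` is a list of [published_rating, true_strength] pairs, updated in
    place.  Each game's score is the expectation under the TRUE strengths
    (an assumed, deterministic stand-in for real results); the rating update
    compares it with the expectation under the PUBLISHED ratings.
    Returns the account's new published rating.
    """
    for _ in range(rounds):
        for opp in pool:
            score = expected_score(account_strength, opp[1])
            predicted = expected_score(account_rating, opp[0])
            delta = k * (score - predicted)
            account_rating += delta   # account moves by delta ...
            opp[0] -= delta           # ... opponent by the opposite amount
    return account_rating


# Hypothetical pool: ten opponents, all rated and playing at 2700.
pool = [[2700.0, 2700.0] for _ in range(10)]
total_before = sum(rating for rating, _ in pool)

# The account is rated 2800 but is now running a 2300-strength engine.
rating_after_test = play_series(2800.0, 2300.0, pool)

points_lost = 2800.0 - rating_after_test
points_gained_by_pool = sum(rating for rating, _ in pool) - total_before

print("account rating after testing:", round(rating_after_test, 1))
print("points lost by account:      ", round(points_lost, 1))
print("points gained by the pool:   ", round(points_gained_by_pool, 1))

# The opposite case: a 2800-strength program carrying a 2300 rating.
# The rating predicts an even game; the true strengths predict a rout.
print("predicted by ratings:  ", expected_score(2300.0, 2300.0))
print("predicted by strengths:", round(expected_score(2800.0, 2300.0), 3))
```

Under these assumptions the update is zero-sum, so every point the account
drops lands on an opponent whose true strength has not changed.  The sketch
says nothing about how far the account overshoots 2800 once the stronger
engine returns; that depends on details it does not model, such as
provisional K-factors and who else the opponents play in the meantime.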




Last modified: Thu, 15 Apr 21 08:11:13 -0700
