Computer Chess Club Archives

Subject: Re: Issue regarding GM strength

Author: Robert Hyatt
Date: 07:35:37 10/27/01
On October 27, 2001 at 02:26:40, Sune Larsson wrote:

>On October 26, 2001 at 22:43:50, Robert Hyatt wrote:
>
>>On October 26, 2001 at 16:12:29, Sune Larsson wrote:
>>
>>>On October 26, 2001 at 15:34:08, Christopher R. Dorr wrote:
>>>
>>>>As I was reading the seemingly neverending discussion about computers being GMs
>>>>or not, one thing strikes me. The vast majority of people discuss only the
>>>>results of programs vs. GMs or other strong programs. Very few seem to focus on
>>>>their performance versus reasonable, but significantly weaker-than-GM opponents.
>>>>As a couple of examples, it seems that the majority of posters on here consider
>>>>Fritz 5 and Tiger 13 as GM strength computers on fast machines. Clearly, they
>>>>can hang with very good GMs on, say, a Celeron 800. If we look only at their
>>>>performance against a theoretical field of FIDE 2500 type GMs, these programs
>>>>would likely grab a performance rating in the neighborhood of 2500-2600, which
>>>>is reasonable to say 'GM strength'.
>>>>
>>>>What to make, however, of the notion that I, a random USCF 2100, can usually
>>>>score 1/8-1/4 against Tiger 13 on a Celeron 800? That equates to a rating
>>>>(against me) for Tiger of approximately USCF 2300-2400, which is clearly *not*
>>>>GM strength. While I rarely beat Tiger, I frequently draw it, at time controls
>>>>ranging from G/5 to G/30, at which one would suppose that a comp would be even
>>>>stronger than at 40/2. I have a very close friend who is also a USCF 2100, who
>>>>has a similar record against Fritz 5.
>>>>
>>>>When I had a copy of Chess Genius a few years ago, this ability to draw it
>>>>almost at will was even more pronounced.
>>>>
>>>>So which is it? Is Tiger the GM program that can perform at a 2550 FIDE level
>>>>against GMs, or is it the USCF 2300 that it plays like against me?
>>>>
>>>>I have played several GMs in tournament play and at fast speeds on the internet.
>>>>I strongly doubt that I could get 1/8 or 1/4 against most decent GMs in a match,
>>>>yet I can fairly easily do that against many programs. If you do not believe me,
>>>>I'd be happy to show you multiple games against computers where their evaluation
>>>>said they were clearly winning, but in reality had drifted into a drawn R+P
>>>>ending or bishops-of-opposite-colour ending. Happens all the time.
>>>>
>>>>The main reason I posted this is to assert my position that we really *cannot*
>>>>say whether or not computers are GMs. The way in which computers play does not
>>>>make that realistic yet. A computer will (in all likelihood) take a draw by
>>>>repetition against me when down .15 just as it will against a GM. I know that you
>>>>can tune that by artificial means such as contempt bonuses and penalties, but
>>>>even with that, computers that I have seen *simply do not play like humans
>>>>play*, not only in terms of style, but also in terms of performance.
>>>>
>>>>If I played an 8 game match against GM Randomovich, and I scored 1.5, would we
>>>>call that a GM performance? Likely not. But if GM Randomovich plays in a
>>>>tournament and scores 4-4 against 2550 GMs, we would. A computer certainly can
>>>>do the latter: but it *also* does the former with regularity. So, in reality, is
>>>>it *really* GM strength?
>>>>
>>>>Chris
>>>
>>>
>>> Very interesting reading. When I played actively, my national rating was
>>> about 2270-2300 - and that was several years ago. A while ago I tried
>>> a series of 21 serious games vs 12 different GM personalities in CM8. They
>>> were played on a PIII 800 with 64 MB hash for the program - 40 moves in
>>> 40 minutes. My preparation was absolutely none - no openings, nothing.
>>> I simply played right from the start. Also I had no interest in creating
>>> anti-computer play. The result was exactly 1/3, meaning 7/21. When I look
>>> at these games I'm certain that my play was not above 2300 - but probably
>>> in the range of 2200-2300 somewhere. Some wins mixed with hard fought
>>> draws and stupid mistakes. With 14/21 for the program, is it then correct
>>> to assume that the program performed around 230 points better than me?
>>> If this is correct, and my rating is put to 2250 - CM performed around
>>> 2480. In 40/40.
>>
>>
>>14/23 is almost 2 of 3.  3 of 4 is roughly 200.  2/3 should be less than that...
>>
>>Bob
>>
>>(14/23 is 60% wins, for those wanting more accurate results.  That sounds like
>>maybe 100 rating points rather than 230 if I understood you correctly).
>
>
> It was 14/*21* for CM giving 66.7% - 33.3%. 11 wins 6 draws 4 losses.
> Is it then more accurate to calculate the difference in performance like
> 16.7 x 7 = 116.9 points?
>
> Sune
>>


I actually use a short "cheat sheet" I have at the office.  I have seen
several different formula-type approximations that are simple to calculate,
but I haven't saved any of them.  2/3 is fairly close to 3/4 of course...
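
For anyone who wants to check the arithmetic, here is a minimal sketch of two such
approximations. It assumes the common logistic form of the Elo expected-score curve,
which is not necessarily what is on my cheat sheet, and official rating tables can
differ from it by a few points. The function names are made up for illustration.

```python
import math

def elo_diff(score_fraction):
    """Rating difference implied by a score fraction, assuming the
    logistic Elo model: expected score = 1 / (1 + 10 ** (-D / 400))."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def linear_diff(score_fraction):
    """Rule of thumb from the quoted post: about 7 rating points for
    each percentage point scored above 50%."""
    return (score_fraction * 100.0 - 50.0) * 7.0

# Compare both estimates for the scores discussed in this thread.
for label, s in [("14/21", 14 / 21), ("3/4", 3 / 4), ("14/23", 14 / 23)]:
    print(f"{label}: logistic {elo_diff(s):6.1f}, linear {linear_diff(s):6.1f}")
```

Under that assumption 14/21 comes out at roughly 120 points and 3/4 at roughly 190,
so the 7-points-per-percent rule is close for results that are not too lopsided.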

Perhaps someone with real numbers handy will post the exact numbers.  In
any case, the basic premise here is interesting, because I have seen this
"draw" problem for a long time and worked on it.  If a GM wants a draw, it
is very difficult to avoid.  Ditto for IM players although there are a few
things to do.  Even masters can draw more than they should if the program
doesn't have some code to avoid certain types of pawn structures...

>>
>>
>>> On the other hand - if human player Jacques Tigér went to South America
>>> and performed what his cousin - the program - did, he would have got
>>> a GM-norm with 2 points margin! Results tell.
>>>
>>> Sune
Last modified: Thu, 15 Apr 21 08:11:13 -0700