Computer Chess Club Archives


Subject: Re: Poll Question ? { Dream Match }

Author: Robert Hyatt

Date: 10:28:49 01/07/00


On January 07, 2000 at 13:21:26, Bertil Eklund wrote:

>On January 07, 2000 at 12:40:09, Albert Silver wrote:
>
>>On January 07, 2000 at 10:05:51, Graham Laight wrote:
>>
>>>On January 07, 2000 at 08:42:30, Albert Silver wrote:
>>>
>>>>On January 06, 2000 at 19:47:10, Graham Laight wrote:
>>>>
>>>>>On January 06, 2000 at 17:20:44, Robert Hyatt wrote:
>>>>>
>>>>>>On January 06, 2000 at 10:43:29, Graham Laight wrote:
>>>>>>
>>>>>>>On January 06, 2000 at 10:23:46, Robert Hyatt wrote:
>>>>>>>
>>>>>>>>It is more than anecdotal.  There is no contrary evidence at all, so far, other
>>>>>>>
>>>>>>>I don't agree - I think that the SSDF list represents "evidence", because they
>>>>>>>have long experience of every level of play the computers have reached since
>>>>>>>1984 or 1985.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>What does the SSDF rating list have to do with whether a computer is at a GM
>>>>>>level or not?  You could add or subtract 400 points from every rating on their
>>>>>>list, and things would still be just as valid according to the Elo formula.
>>>>>>The 'spread' between two programs on the SSDF list is correct.  The absolute
>>>>>>values of the ratings are over-inflated.  Or do you believe that a computer is
>>>>>>really playing at 2700 and is in the top 10 in the world?
>>>>>>
>>>>>>I don't...
>>>>>
>>>>>If the 'spread' is correct, then the absolute values must also be right -
>>>>>because the SSDF list is known to correlate well with FIDE ratings UP TO A
>>>>>CERTAIN LEVEL (though it is admitted to be 20-30 points too high).
>>>>
>>>>Oh? What level? Which program correlates to any FIDE rating? For that matter
>>>>which program has a FIDE rating to correlate to, or which human has a SSDF
>>>>rating to compare to his FIDE rating? The last time they correlated to human
>>>>ratings as far as I know was back in 1990 or so, when the Fidelity Par Excellence
>>>>was rated 1850 in France after testing it in 40 games at 40/2 against human
>>>>players and the SSDF had it at 1834 (something like that), and the Fidelity Mach
>>>>III was rated at 2036 in France (same conditions) and the SSDF had it at 1993.
>>>
>>>Thanks for this information, which I regard as supportive, because it shows that
>>>the SSDF had rated the computers quite accurately!
>>>
>>>Since then, they've had years of extra experience to help them grapple with the
>>>problems of "getting it right".
>>
>>What exactly was done about "getting it right" as you say? To my knowledge
>>nothing.
>>
>>
>>>
>>>>Of course, the SSDF also organized games against humans back then and included
>>>>these in the rating list. Still, there weren't any FIDE ratings below 2200 then
>>>>either.
>>>>You also mention that it was ADMITTED to be 20-30 points over-rated. Admitted
>>>>implies that someone is in possession of incontrovertible information. I don't
>>>>think the SSDF possesses ANY information to make such a statement.
>>>>
>>>>>
>>>>>I think that you are saying that, relative to the FIDE ratings, the spread is
>>>>>too great at the high end.
>>>>>
>>>>>If it is true that the SSDF ratings correlate well with the FIDE ratings up to,
>>>>>say, 2400 points (which probably is true), then what I think you are telling me
>>>>>is that, for those computers above 2400 on the SSDF list, the gap between them
>>>>>is too big, and that therefore the higher you get on the SSDF list, the more
>>>>>overinflated the scores are, relative to human players.
>>>>>
>>>>>>>>than 'opinion polls'.  Let's watch the Rebel games.  That will be a reasonable
>>>>>>>>gauge...
>>>>>>>
>>>>>>>Certainly. Even better if the SSDF take up Ed's offer to test Rebel Century.
>>>>>>>
>>>>>>>-g
>>>>>>
>>>>>>
>>>>>>That doesn't help a bit for the SSDF rating numbers.  Their rating pool of
>>>>>>players has nothing whatsoever to do with FIDE, so the ratings can't be compared
>>>>>>at all.  If they wanted, they could take rebel-10's eventual TPR as a real FIDE
>>>>>>rating, then enter Rebel into the SSDF testing cycle, and when it finishes,
>>>>>>reduce everyone's rating by X so that rebel's SSDF rating matches its TPR rating
>>>>>
>>>>>Agreed.
>>>>
>>>>I disagree. You will only be prolonging the problem and will eventually get back
>>>>to the situation we have now.
>>>
>>>Hence we have to try to make the best of what information we do have.
>>>
>>>>>
>>>>>>for the GM challenge matches.  I think that X will be 200 points or more, IMHO.
>>>>>
>>>>>In my opinion, which is equally humble (of course!), Tiger's FIDE rating is
>>>>>probably about 2660 - I don't think that this is quite in the top 10.
>>>>
>>>>On what is your opinion based? My opinion is different but not based on any
>>>>scientific knowledge or testing. Merely my observation of its play, and what it
>>>>knows and doesn't. If it's playing 2660, it's the most ignorant 2660 I ever saw.
>>>>
>>>>                                       Albert Silver
>>>>
>>>
>>>2666 = 2696 (SSDF rating) - 30 (to convert from SSDF to FIDE scale)
>>
>>There is no conversion scale. I readily accept that the Fidelity Par Excellence
>>is 1835 as this was backed up by testing against human players, but the rest is
>>pure extrapolation. Here is how Chess Tiger's 2696 (-30 to get the FIDE rating
>>of course) was achieved (very roughly as there were more computers involved but
>>the system is the same):
>>
>>Mephisto MM4 beat the Par Excellence (1835) 12.5-7.5 and was thus rated 1904.
>>Mephisto Roma 68000 beat the MM4 (1904) 19-9 and was thus rated 1970.
>>Fidelity Mach III beat the Roma 68000 (1970) 139.5-96.5 and was thus rated 1993.
>>Mephisto Lyon 68020 beat the Mach III (1993) 19-8 and was thus rated 2150.
>>Fritz 3 on a 486/66 beat the Lyon (2150) 13-7 and was thus rated 2257.
>>Genius 2.0 on a 486/66 beat Fritz3 (2257) 12-9 and was thus rated 2336.
>>Hiarcs 4 on a P90 beat Genius 2.0 (2336) 11-9 and was thus rated 2392.
>>
>>[Note that no humans have anything to do with this]
>>
>>Rebel 8.0 on a P90 beat Hiarcs 4 (2392) 11.5-8.5 and was thus rated 2438.
>>Mchess Pro 8.0 on a P200MMX beat Rebel 8.0 (2438) 12-8 and was thus rated 2492.
>>Junior 5 on a P200MMX beat MCPro 8.0 (2492) 14.5-9.5 and was thus rated 2542.
>>Chess Tiger 12 on a K6-2/450 beat Junior 5 (2542) 31.5-14.5 and was thus rated
>>2696.
>>
>>Conclusion:
>>
>>We can now confidently say Chess Tiger 12 is about 2666 FIDE (minus the 30
>>extraneous points so kindly admitted by the SSDF), which is a little stronger
>>than Victor Korchnoi, Judit Polgar, Yasser Seirawan, and World FIDE champion
>>Alexander Khalifman, and just a few points shy of Peter Svidler, Nigel Short,
>>Boris Gelfand and Anatoly Karpov, BECAUSE:
>>
>>it beat Junior 5 which beat Mchess Pro 8 which beat Rebel 8.0 which beat Hiarcs
>>4.0 which beat Genius 2.0 which beat Fritz 3 which beat the Mephisto Lyon 68020
>>which beat the Fidelity Mach III which beat the Mephisto Roma 68000 which beat
>>the Mephisto MM4 which beat the Fidelity Par Excellence which was rated 1835
>>back in 1989!
>>
>>Yes!!!!
>>I see it now!
>>It is all so clear!
>>Enlightenment!!!!!
>>
>>                                     Albert Silver
>>
>>
>>Hi!
>
>Very good post!
>
>I play in a chess club with players from 1200 to 2500.
>We still basically use the same rating system. 1400 beats 1200, 1600 beats 1400,
>1800 beats 1600 and so on.
>
>If you have a rating, is it a gift from God or what?
>
>Regards Bertil
>>


Finally you are looking at the right data.  1600 vs 1400 should say that the
1600 player scores about 3 points out of every 4 games.  Add 1000 to each of
those ratings.  The 2600 still scores about 3 of every 4 against the 2400
player.  That is the point: only the difference between two ratings means
anything.  The SSDF scale is grossly inflated, but the spread between programs
is right, based on the computer vs computer games.  It is a given that if you
took the top 5 programs and played them in FIDE events, the order and the
ratings would likely be different, _particularly_ the absolute values of the
ratings.  They would be far lower.  Forget the idea that tiger is almost 2700
FIDE.  It just ain't so.  It is 2700 in the SSDF rating pool, of course, and it
is correct there.  But that 2700 has zilch to do with a 2700 in FIDE.
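
To make the arithmetic concrete, here is a minimal sketch in Python.  It assumes
the standard logistic Elo expectation with a 400-point scale; the function name
expected_score is only illustrative, not anything the SSDF or FIDE uses.  It
shows that a 200-point gap predicts the same result whether the pair is
1600/1400 or 2600/2400.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, draw = 0.5) for player A against player B.

    Assumes the standard logistic Elo formula with a 400-point scale.
    Only the rating difference enters the formula.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# A 200-point gap: roughly 0.76, i.e. about 3 points out of every 4 games.
club = expected_score(1600, 1400)

# Add 1000 to both ratings: the gap, and so the expectation, is unchanged.
shifted = expected_score(2600, 2400)

print(f"{club:.3f} {shifted:.3f}")
```

Since only the difference matters, a whole rating pool can sit 200 points too
high or too low and every predicted result inside that pool stays the same.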





>>
>>>
>>>If Tiger is ignorant, then so was DB in 4 of the 6 games it played against GK in
>>>'97 - but it still achieved an MPR of around 2900. Maybe ignorance is not so bad
>>>in a computer.
>>>
>>>-g




Last modified: Thu, 15 Apr 21 08:11:13 -0700
