Author: Thomas Mayer
Date: 05:35:31 11/10/01
Hi Christophe,

> Yes, that's right.
> There are 9 games posted on this link.
> The 3 games Thomas is talking about are the only ones Chess Tiger for Palm
> lost.
> Chess Tiger for Palm won all the other games!
> The right way to compute a rating would be to take the average rating of the
> opponents and to count a +6 =0 -3 victory of Chess Tiger for Palm against this
> virtual average player.

Well, Blikskottel has about 1750-1800 ELO and the Golem version used was around 1500 on those rating lists. I doubt that you would be very satisfied with the rating we can calculate out of this... :) And I doubt that it would be fair to ChessTiger to calculate a rating out of this. As I have said, I am quite sure that ChessTiger on Palm is around 2100. As you have explained, your testing shows its strength, and I believe you, of course. Anyway, where is the mistake?

Another example (to get away from ChessTiger; I love your products, it was not my intention to offend you...): On the SSDF list we have Fritz 5.0 at 2460 and Shredder 3 at 2417 on a Pentium 200 MHz. For testing purposes I sometimes use an old Pentium II 233 MHz to test my own engine. Giving the better software the slower PC is interesting because the better engines then have to win on positional factors and cannot outsearch my engine so easily. But this PII/233 is now too slow: Quark (current beta) scores clearly above 50% against Fritz 5 and Shredder 3 on the PII/233 when Quark runs on an Athlon 1333...

What should we conclude from that? That Quark is over 2400??? I totally doubt that, and I am quite happy that Quark has reached maybe 2300 nowadays... Results from the computer chess tourneys and the winboard rating lists indicate that I am around 2300 ELO on the fast Athlon... But as a consequence: are Fritz 5 and Shredder 3 weaker than their SSDF ratings? Where is the mistake??? Or did they get weaker over the last years? Without being changed? Is that possible???

And that's a problem of the SSDF list. That's why I would like the SSDF to drop the attempt to make the list comparable with human ELO. It isn't. Processor speed is more important in comp-comp comparison than in human-comp comparison. I remember a post of Bob Hyatt where he points out that strong humans do not see a big difference whether he uses his fast Quad or his slower Quad. In comp-comp comparison that might make a bigger difference. The SSDF knows about that problem. That's why they lowered the list once by 100 points, but as we all know they were not very satisfied with it. And I believe that when we get to the next hardware level after the Athlon 1200, maybe a processor with 3 GHz, that will be the next point where they must lower the list by 100 ELO or whatever to have a good comparison for the NEW top programs. Without trying to make it comparable to human ELO there would be no such problem at all. That's my real point.

Greets, Thomas

P.S.: The SSDF does a great job in comparing different programs, there is no doubt. But ELO inflation is definitely a problem.
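
For readers who want to follow the arithmetic, the calculation Christophe describes in the quoted text (average the opponents' ratings, then score the +6 =0 -3 result against that virtual player) can be sketched roughly as below. This is only a minimal sketch, assuming the common logistic approximation of the Elo curve rather than any official rating table. The function names are hypothetical, and the 1650 average is a placeholder for illustration, not the real average of the nine opponents, which the post does not give.

```python
import math


def performance_offset(wins, draws, losses):
    """Rating difference implied by a match score.

    Assumption: logistic Elo curve, expected score = 1 / (1 + 10 ** (-d / 400)),
    solved for d. Not an official SSDF or FIDE procedure.
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score has no finite performance rating.
        raise ValueError("performance rating undefined for 0% or 100% scores")
    return 400.0 * math.log10(score / (1.0 - score))


def performance_rating(average_opponent_rating, wins, draws, losses):
    """Performance against one 'virtual average player'."""
    return average_opponent_rating + performance_offset(wins, draws, losses)


if __name__ == "__main__":
    # +6 =0 -3 is a 2:1 score, i.e. roughly 120 points above the opposition.
    print(round(performance_offset(6, 0, 3), 1))
    # Placeholder average of 1650 (hypothetical, for illustration only).
    print(round(performance_rating(1650, 6, 0, 3)))
```

Under this formula a 6/9 score is worth only about 120 points over the average opponent, so against opposition rated around 1500-1800 the result would land far below 2100. In the same way, any score above 50% against an engine listed at 2460 implies a performance above 2460, which is exactly the contradiction raised in the Quark example.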
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.