Computer Chess Club Archives


Subject: Re: How strong is Chess Tiger on Palm ?

Author: Christophe Theron

Date: 08:44:22 11/10/01



On November 10, 2001 at 09:56:39, Thomas Mayer wrote:

>Hi Uri,
>
>>>> The right way to compute a rating would be to take the average rating of the
>>>> opponents and to count a +6 =0 -3 victory of Chess Tiger for Palm against
>>>> this virtual average player.
>
>>> Well, Blikskottel has about 1750-1800 ELO and the used Golem version was
>>> around 1500 on those rating lists. I doubt that you would be very satisfied
>>> with the rating we can calculate out of this... :)
>
>>We can calculate no rating based on this information because we have no stable
>>rating for the program that beat Palm Tiger.
>
>oh my god, Uri, I really do not want to calculate any rating out of 9 games. It
>would be unfair to ChessTiger, think about the hardware difference. This example
>should just show that we must be very careful what we compare when using
>different hardware platforms. Is it chess strength or just computer power...
>well, it is not ONLY computer power - but that does influence the results very
>much...




There is nothing new here. Doubling a computer's speed adds approximately 70 elo
points to its chess rating.
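
A minimal sketch of this rule of thumb, assuming every doubling of speed is worth the same fixed number of Elo points (the 70 figure above; the function name and the clock-ratio example are illustrative assumptions, not measured data):

```python
import math

def elo_gain_from_speedup(speed_ratio, elo_per_doubling=70.0):
    """Estimated Elo gain for a machine `speed_ratio` times faster.

    Assumes each doubling of speed is worth a constant `elo_per_doubling`
    (the ~70 figure quoted above); this is a rule of thumb, not a law.
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    return elo_per_doubling * math.log2(speed_ratio)

# Hypothetical example: treat the raw clock ratio of an Athlon 1333 to a
# Pentium 200 as the speed ratio (this ignores architecture differences).
print(round(elo_gain_from_speedup(1333 / 200)))
```

Under these assumptions the hardware gap alone is worth roughly 190 Elo points, before any difference in the programs themselves is considered.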

That could start to explain a lot of things about your idea of matching programs
on very unequal hardware.

There is such an elo difference that the error margin resulting from the match
makes any elo computation useless.
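
To give a feel for the size of that error margin, here is a rough sketch that turns a match result into an Elo difference with an approximate 95% interval. It assumes the standard logistic Elo curve and a normal approximation on the mean score, which is crude for a handful of games; the function names are illustrative.

```python
import math

def elo_diff_from_score(score, eps=1e-3):
    """Elo difference implied by a score fraction, via the logistic Elo curve."""
    s = min(max(score, eps), 1.0 - eps)  # clamp so 0% and 100% stay finite
    return 400.0 * math.log10(s / (1.0 - s))

def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (low, point, high) Elo difference for a match result.

    Uses a normal approximation on the mean score, which is crude for
    very short matches; treat the numbers as an order of magnitude.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score (1, 0.5 or 0 points per game).
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_diff_from_score(mean - z * se),
            elo_diff_from_score(mean),
            elo_diff_from_score(mean + z * se))

# The +6 =0 -3 result discussed in this thread.
low, point, high = match_elo_interval(6, 0, 3)
print(f"{point:+.0f} Elo, roughly {low:+.0f} to {high:+.0f}")
```

For a 9-game match the interval spans several hundred Elo points. The curve also gets steeper as the score approaches 0% or 100%, so a lopsided match between very unequal opponents makes the Elo estimate even less precise.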





>> I also believe that Tiger gains more from extra time than the weak programs
>> do, so we can get no idea about the SSDF rating of Palm Tiger based on this
>> information.
>
>Can I quote Christophe: programmers who argue that way are searching for an
>excuse for a bad rating... I do not totally agree with that statement but he is
>not completely wrong with it.
>
>>> On the SSDF we have listed Fritz 5.0 with 2460 & Shredder 3 with 2417 on
>>> Pentium 200 MHz. Well, for testing issues I use sometimes an old Pentium II
>>> 233 MHz to test my own engine. Giving the better software the slower PC is
>>> interesting because those better engines must win now because of positional
>>> factors and can not outsearch my engine so easily. Well, but this PII/233 is
>>> now too slow, Quark (current beta) scores clearly above 50% against Fritz 5
>>> and Shredder 3 on PII/233 when Quark runs on Athlon 1333... What should we
>>> conclude out of that ? That Quark is over 2400 ??? I totally doubt that and
>>> I am quite happy that Quark has reached maybe 2300 nowadays... Results of
>>> the computer chess tourneys and on the winboard rating lists indicate that I
>>> am around 2300 ELO on the fast
>>> Athlon...
>
>> It is possible that quark(athlon) is better than 2400 against humans if you
>> use fast hardware.
>
>believe me, it is not, not near to 2400...
>
>> I do not know about the winboard lists but if they are based only on comp-comp
>> games I do not see how they get the 2300 when Fritz5 (P200) has more than 2400
>> SSDF rating.
>
>It is based on winboard engines. What is so wrong with that ? No need to include
>any commercials, cross-comparison is made with Crafty, SOS & Little Goliath...
>
>> I doubt if Fritz6(p233) can get 2400 against humans because humans have it
>> when they do not have quark.
>
>So is the rating estimation on the SSDF-list wrong ??? Only with Fritz or also
>with others ???
>
>But another example, due to a lack of time I have not tested that, but I am
>quite sure that it would happen that way:
>
>Take Quark on A1333 in the SSDF-list and let it play only against those programs
>on P200 and P90 - it may end up with something over 2500, who knows... Now let
>it play only against those programs with equal hardware. It will maybe end up
>below 2200...



No it will not.

Do you think you understand the elo system?

Don't forget that you can get the same elo rating by winning bigtime against
weaker opponents or losing bigtime against stronger opponents.

In the experiment you mention above, there is no reason to believe that you are
going to get an inconsistent rating for Quark depending on the strength of its
opponents.
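
This can be checked with the standard logistic Elo formulas. The sketch below assumes a hypothetical engine whose true strength is 2300 and computes its expected score against weaker and stronger fields, then the performance rating each score implies. The function names and the 2100/2500 opponent ratings are illustrative, and the way the SSDF actually computes its list is not described here.

```python
import math

def expected_score(rating, opponent_rating):
    """Expected score (0..1) of `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))

def performance_rating(opponent_avg, score):
    """Rating implied by scoring `score` against opponents averaging `opponent_avg`."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return opponent_avg + 400.0 * math.log10(score / (1.0 - score))

true_rating = 2300  # hypothetical engine strength
for opponents in (2100, 2500):
    score = expected_score(true_rating, opponents)
    # Winning big against 2100 and losing big against 2500 imply the same rating.
    print(opponents, round(score, 2), round(performance_rating(opponents, score)))
```

Within the model, both pairings give back the same 2300. Real matches will only agree to within their error margins, and only as far as the Elo model holds across the rating gap.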

Actually the SSDF has already done matches like these and they showed consistent
results.

And don't forget about the error margins of these matches.

You should be more careful about what you say. It's not easy to discuss with you
at this time because there are inconsistencies in what you say. On the other
hand you might have some interesting ideas to express; don't spoil them with
approximations like the above.



    Christophe




>So what is the right rating now ?
>So the problem is, that the different hardware platforms are totally out of
>sync on the SSDF-list. On the other hand I am quite sure that the P200 & P90
>engines may produce results against humans according to their rating. But if you
>start to test them much against the fast athlons they will never achieve the
>high rating they have on the list at the moment.
>
>Greets, Thomas



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.