Computer Chess Club Archives


Subject: Re: Are we ignoring basic math & statistics

Author: Christophe Theron

Date: 12:20:10 10/25/01



On October 25, 2001 at 13:28:19, José Carlos wrote:

>On October 25, 2001 at 11:30:04, Christophe Theron wrote:
>
>>On October 25, 2001 at 07:39:46, José Carlos wrote:
>>
>>>On October 25, 2001 at 06:57:29, Mike Hood wrote:
>>>
>>>>On October 24, 2001 at 22:03:49, Stephen A. Boak wrote:
>>>>
>>>>>On October 24, 2001 at 11:43:18, Joshua Lee wrote:
>>>>>
>>>>>>For Starters If Deep Fritz were that Magical 2700+ number Like the SSDF Claims
>>>>>>Then Huebner wouldn't have Drawn Every Game of their 6 game Match
>>>>>>Secondly With All Due Respect No Commercial Program Has Played As Many Humans As
>>>>>>The Deep Thought/Blue Programs and Also The Number of Games Vs. Rating Average
>>>>>>Is Unequal (Not as many games as Deep Thought) If you Suggest that programs are
>>>>>>So Strong why Then Hasn't One of the Top Commercial's Put up so much Money as to
>>>>>>Play Against a Top 10 Opponent and Not a Couple of Unknowns?
>>>>>>
>>>>>>Tiger Didn't Beat All GM's and I don't think they were very Strong GM's someone
>>>>>>even mentioned that Tiger was Lost in One Position. That may not say Much but I
>>>>>>would Consider Rebel's Achievement or Deep Junior's Much More Impressive.
>>>>>>Rebel because of So many Games against Strong and well Known GM's Like Rhode and
>>>>>>Scherbakov and Deep Junior for Beating GM Leko and Huebner, Drawing Everyone
>>>>>>else Besides Kramnik and Lautier.
>>>>>>
>>>>>>8 Games are not really enough and 1 Tournament By no means makes a Computer a GM
>>>>>>, They Can't Get The Title anyway, I would Like for this to be a possibility
>>>>>>Then maybe someone would Try for their program to get it and we could Look to
>>>>>>FIDE instead of SSDF. I hate that the list should be lowered by up to 200
>>>>>>points even by their own estimate the link is on their page.
>>>>>>
>>>>>>Another thing Tiger's Rating On an 866 Compared to the Speed Difference of the
>>>>>>SSDF would Still Point to the SSDF's Given Rating for Tiger to be Wrong.
>>>>>>
>>>>>>Tiger is 2703 on a 1200
>>>>>>While 2788 against an average 2497FIDE On a Slower 866 Hmm Somebody is wrong
>>>>>>Either all those players were lying about their rating or Could it be that the
>>>>>>SSDF Is Off ...
>>>>>
>>>>>Curiosity leads me to pose some questions to thoughtful posters:
>>>>>
>>>>>Ever hear of natural variation?  Do you think that a 2497 player plays at 2497
>>>>>strength (whatever that means) on each move, and across each game, no matter the
>>>>>day or time or opponent or how well he is feeling?
>>>>>
>>>>>Ever hear of the uncertainty of measurement?  What is the level of confidence
>>>>>that a 2497 player is *actually* (whatever that means) a 2497 strength player?
>>>>>
>>>>>Can you accept random chance (natural variation) as a reason for occasional
>>>>>exceptional results for programs or humans?
>>>>>
>>>>>Can you accept that measurements are all subject to some level of uncertainty,
>>>>>some level of confidence less than certainty?
>>>>>
>>>>>If so, the above statement (prior poster) makes little sense.
>>>>>
>>>>>If not, I understand the dilemma and recommend a good introductory book on
>>>>>statistics.
>>>>>
>>>>>Opinions are welcome, I have no problem with them.  But do posters investigate
>>>>>and try to learn about the subject they comment on, or are they curious to
>>>>>discover what they may be missing in their view of things?
>>>>>
>>>>>Math is not a solution to everything.  It is an often useful tool.  It both has
>>>>>its uses and its limitations.  But to ignore it completely seems silly.  Do
>>>>>posters know they ignore some basic uses of math (often statistics) when they
>>>>>post?  Do they care?
>>>>>
>>>>>Just curious.
>>>>>
>>>>>--Steve
>>>>
>>>>Thanks, Steve. I often have thoughts like yours when I read posts with titles
>>>>like "Beowulf is better than Deep Fritz on a 1.6 Ghz PC".
>>>>
>>>>What is the statistical background of the ELO rating system?
>>>
>>>  As I've asked some times: is there a good mathematical way to measure
>>>'strength'? What is 'strength' actually? Can anyone give a precise definition of
>>>'strength'? Without such a precise definition we can't draw any conclusion at
>>>all about players' strength. And if we want to draw mathematical conclusions, we
>>>need a mathematical definition.
>>>  IMO, measuring ELO rating (which is defined by a mathematical formula) is very
>>>different from measuring 'strength'.
>>
>>
>>
>>Strength is very clearly defined by the elo system. At least "relative strength
>>inside a pool of given players".
>>
>>If you have a better definition (as you do not seem to be convinced by the Elo
>>definition), feel free to submit yours...
>
>>    Christophe
>
>  I wish I had one, but I don't. As you said, Elo provides a good [enough]
>definition of "relative strength inside a pool of given players". That shows
>exactly the point I try to make: we can measure _relative_strength_ and
>_inside_a_pool_. And even that is debatable, because we are saying
>strength = results.
>  Ok, it's a fine way to do it.
>  But it seems that some people talk about "absolute strength", and for that, we
>don't have a definition, AFAIK.



We could define absolute strength as the performance on a given set of
positions, but that would be arbitrary anyway.

The simplest definition is "the strongest wins more games". This defines
strength in terms of an "offset" between two or N players.
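
To make the "offset" idea concrete, here is a minimal sketch in Python of the
commonly used logistic form of the Elo relationship between a rating
difference and an expected score. The function names are hypothetical, and
the logistic curve with a 400-point scale is an assumption: rating bodies may
use lookup tables or a slightly different curve.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Expected score (between 0 and 1) for a player rated
    `rating_diff` points above the opponent, using the logistic
    Elo curve with the conventional 400-point scale (assumed)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def rating_offset(score_fraction: float) -> float:
    """Inverse of expected_score: the rating offset implied by a
    score fraction. Undefined for 0% or 100% scores, because the
    logistic curve never reaches those values."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


if __name__ == "__main__":
    # Hypothetical example: a player scoring 5.5 points out of 8 games.
    offset = rating_offset(5.5 / 8)
    print(f"implied offset: {offset:+.0f} rating points")
```

Note that the result is only ever an offset relative to the opponents actually
played, and a sample as small as 8 games leaves a wide margin of uncertainty
around it.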



    Christophe



