Computer Chess Club Archives

Subject: Re: ELO isn't a normal bell curve, without some transformation

Author: Uri Blass
Date: 14:36:26 06/06/02

On June 05, 2002 at 00:05:34, Dann Corbit wrote:

>On June 04, 2002 at 23:30:56, Stephen A. Boak wrote:
>
>>Hi Dann,
>>
>>1. Since Elo's system defines (by design choice!) each specific rating
>>difference in terms of a specific expected scoring percentage, regardless of
>>where the two ratings fall on the scale, I suspect (but am not sure, not having
>>worked out the math yet on paper) that the simple plotting of ratings in a
>>histogram would not be a normal bell-shaped curve.
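
The property described in point 1 can be sketched in a few lines. This is a minimal illustration, assuming the commonly used logistic form of the Elo expectation with a 400-point scale (Elo's original tables were based on a normal distribution, so exact percentages differ slightly). The function name `expected_score` is made up for this example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the logistic Elo formula.

    Assumption: logistic curve with a 400-point scale, not Elo's
    original normal-distribution tables.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# The same 200-point gap gives the same expectation anywhere on the scale.
low = expected_score(1700, 1500)
high = expected_score(2700, 2500)
print(round(low, 3), round(high, 3))
```

Under this assumption a 200-point gap corresponds to an expected score of about 0.76 for the stronger side, whether the players are rated 1700 and 1500 or 2700 and 2500. Only the difference enters the formula, which is why the shape of a plain rating histogram is a separate question.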
>
>The original question was whether or not computer strength was normally
>distributed.  This is a different random variable and is also normally
>distributed.  We can also plot win percentages multiplied by the opponent's
>strength just as well:
>
>SELECT int((win_percentage * opponent_strength)/5000), count((win_percentage *
>opponent_strength)/5000)
>FROM SSDF
>GROUP BY int((win_percentage * opponent_strength)/5000);
>
>Expr1000 (bin)	Expr1001 (count)
>8	3
>9	3
>10	3
>11	6
>12	3
>13	12
>14	6
>15	9
>16	10
>17	15
>18	7
>19	12
>20	6
>21	18
>22	18
>23	20
>24	14
>25	11
>26	13
>27	13
>28	6
>29	10
>30	2
>31	4
>32	1
>33	7
>34	1
>
>Or (squished a bit more):
>
>SELECT int(([win_percentage]*[opponent_strength])/15000),
>count(([win_percentage]*[opponent_strength])/15000)
>FROM SSDF
>GROUP BY int(([win_percentage]*[opponent_strength])/15000);
>
>Expr1000 (bin)	Expr1001 (count)
>2	3
>3	12
>4	21
>5	34
>6	25
>7	56
>8	38
>9	29
>10	7
>11	8
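
For readers without the SSDF table at hand, the binning done by the two queries above can be approximated outside the database. This is a rough sketch, not the original method: the rows are hypothetical placeholders rather than real SSDF data, `bin_counts` is an illustrative name, and it assumes win_percentage is on a 0-100 scale.

```python
from collections import Counter


def bin_counts(rows, bin_width):
    """Count rows per bin of win_percentage * opponent_strength.

    Mirrors GROUP BY int((win_percentage * opponent_strength) / bin_width)
    for non-negative products.
    """
    counts = Counter(
        int((win_pct * opp_strength) // bin_width)
        for win_pct, opp_strength in rows
    )
    return sorted(counts.items())


# Hypothetical (win_percentage, opponent_strength) rows, NOT real SSDF data.
sample_rows = [(50.0, 2400.0), (52.0, 2350.0), (61.0, 2500.0), (48.0, 2450.0)]
for bin_index, count in bin_counts(sample_rows, 15000):
    print(bin_index, count)
```

Changing `bin_width` from 5000 to 15000 is the "squishing" step: wider bins mean fewer rows in the output and a smoother-looking histogram.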
>
>>Wouldn't some transformation be required to convert such ratings into
>>'normalized' figures which *theoretically* might look more like a bell shaped
>>curve?
>
>There is a surprising range of curve shapes that still fit the gaussian model
>pretty well.
>
>>2. Over time, as new & improved program versions & ratings rise, due to advances
>>in chess programming algorithms & techniques (and hardware improvements,
>>perhaps), wouldn't the overall plotting of ratings on a histogram (roughly from
>>older, weaker programs to newer, stronger programs) more closely follow the
>>growth curve for average rating of each new crop of released program/hardware,
>>rather than the normal bell curve?
>
>Since they are different hardware setups or different program versions, they are
>treated as different organisms.  The method you suggest should only be used to
>model a single program, and then only changing one variable at a time (unless
>you intend to generate a surface).
>
>>3. Perhaps any program crop released within a relatively short span of time (say
>>a year or so) would have ratings plottable (with transformation, as noted above)
>>that closely approximate the normal (bell) curve.
>
>I think probably the leptokurtic shape is a function of reality.  In other
>words, if a program is dominatingly better, nobody would buy the others.  If a
>program is dominatingly weak, then nobody will buy it.  So they are forced to be
>fairly close in ability.  There is a broad mass with nearly equal ability and a
>few outliers with exceptional strength or weakness.
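
Whether a histogram is leptokurtic can be checked numerically rather than by eye. The sketch below computes the population excess kurtosis from (value, count) pairs, using the bin indices and counts from the coarser (/15000) table quoted in this thread. Treating each bin index as the value is an assumption that discards within-bin detail, so the result is only a rough indication. The function name is illustrative.

```python
def excess_kurtosis(binned):
    """Population excess kurtosis of values given as (value, count) pairs."""
    n = sum(count for _, count in binned)
    mean = sum(value * count for value, count in binned) / n
    m2 = sum(count * (value - mean) ** 2 for value, count in binned) / n
    m4 = sum(count * (value - mean) ** 4 for value, count in binned) / n
    # A normal distribution has kurtosis 3, so subtract 3 to get the excess.
    return m4 / m2 ** 2 - 3.0


# Bin index and count copied from the coarser (/15000) table in the post.
coarse_bins = [(2, 3), (3, 12), (4, 21), (5, 34), (6, 25),
               (7, 56), (8, 38), (9, 29), (10, 7), (11, 8)]
print(excess_kurtosis(coarse_bins))
```

A positive value indicates heavier tails and a sharper peak than a normal distribution (leptokurtic); a negative value indicates the opposite. With only ten coarse bins the estimate is noisy, so it is better read as a sanity check than as a test.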
>
>In other words, if someone wrote a 3000 ELO program, it would be the only one
>that people bought and got tested and we would see a spike.

No.

If the 3000 Elo program cost $3000, then I suspect that a lot of people would
prefer to buy a program that is more than 200 Elo weaker for $50.

Uri