Computer Chess Club Archives


Subject: Re: ply search vs elo rating - proposed formula

Author: Jeremiah Penery

Date: 06:19:11 10/21/99


On October 21, 1999 at 06:52:53, Graham Laight wrote:

>On October 20, 1999 at 19:27:26, Jeremiah Penery wrote:
>
>>On October 20, 1999 at 16:27:23, Joshua Lee wrote:
>>
>>>>>>>This would yield the following results:
>>>>>>>
>>>>>>>Ply     Elo Rating
>>>>>>>===     ==========
>>>>>>>
>>>>>>>2       1098
>>>>>>>4       1386
>>>>>>>6       1635
>>>>>>>8       1855
>>>>>>>10      2052
>>
>>>????????? A while back I posted games in which Hiarcs played exactly the same moves as
>>>two different IMs for White's first six moves and Black's first five, so this is too
>>>low.
>>
>>Hiarcs searches exactly 10-ply?  Also, just because 6 moves were the same, does
>>not mean _all_ the moves were the same.  Further, these numbers were figured
>>using K=0.15 for the chess-knowledge.  Hiarcs may have either more or less, and
>>so it is impossible to tell actual numbers for any program.
>>
>>>>>>>12      2230
>>>>>>>14      2392
>>
>>>Again, this is too low: Deep Blue was at least 2650 searching 14 ply.
>>
>>Deep Blue did not search only 14-ply.  It did 14-ply brute-force, plus 30+ ply
>>of extensions in most of the interesting lines.  Not to mention that DB would
>>probably have a K-value of greater than 0.15, which would make all the ratings
>>on this list go up for each depth.  However, I suspect that the ratings may
>>level out a bit at the top, because the maximum theoretical Elo rating is
>>somewhere around 3000, I believe.
>>
>>>>>>>16      2542
>>>>>>>18      2680
>>
>>>That makes these at least 300 Elo points too low.
>>
>>See above.
>>
>>>>>>>20      2809
>>>>>>>22      2929
>>
>>>Otherwise, good idea: rework the numbers and we have a good indicator of strength
>>>vs ply.
>>>Also, at 1 ply Hiarcs was winning 3 out of 4 games against me, so maybe at 1 ply
>>>it is 1700, but again this is all relative to the time control. Are we talking
>>>3 min per move?
>>
>>Again, 1-ply doesn't necessarily mean exactly 1-ply.  It means 'at least' one
>>ply.  There are always extensions and quiescence search to achieve greater
>>apparent depth.
>>
>>Jeremiah
>
>Firstly, one of you said that there was a theoretical max Elo rating of 3000.
>Presumably, this is calculated by correlating Elo rating with the proportion of
>draws achieved between 2 players of that level. Is this correct? Has anyone done
>this work?

I'll try to look up where I found this... It wasn't exactly 3000, I think, but
somewhere around that number.

>Secondly, the time the computer takes theoretically doesn't matter (though it is
>well known that if a computer plays quickly, the human opponent tends to play a
>lot worse).
>
>Thirdly, in view of the estimated figures for Hiarcs shown above, I have decided
>to modify Laight's equation as follows:


Hiarcs may simply have a greater K than the 0.15 you were using before.  DB
almost certainly does.  The new formula looks pretty good, though. :)


>Laight's Equation
>=================
>
>Version 2: 21/10/99
>
>Elo rating = log((Ply * K * C1) + C2) * C3
>
>Where Ply = search depth in plies
>K = Knowledge Level
>C1, C2, C3 are constants
>log = base-10 logarithm (the base that reproduces the figures below)
>
>The extra constant, C1, is necessary to compress the range of results being
>produced.
>
>As before, K is calculated as follows:
>
>Kn = % of all the useful chess knowledge the program has
>K = Kn/(100 - Kn)
>
>If C1 = 0.1, C2 = 1.3, C3 = 13500, and K = 0.2, this yields the following
>results:
>
>Ply 1 elo = 1628
>Ply 10 elo = 2377
>
>This is well in line with the numbers Joshua Lee suggested above.
>
>If Laight's equation applied to Hiarcs and Deeper Blue is accurate, it implies a
>K of 0.2, which would mean that both computers have about 17% of all the useful
>chess knowledge.
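
For anyone who wants to check the quoted numbers, here is a minimal Python sketch of Version 2 of the formula. It assumes "log" means the base-10 logarithm, since that is the base that reproduces the 1628 and 2377 figures; the function names are made up for illustration and are not from Graham's post.

```python
import math

def k_from_percent(kn):
    """K = Kn / (100 - Kn), where Kn is the percentage of useful chess knowledge."""
    return kn / (100.0 - kn)

def percent_from_k(k):
    """Inverse of k_from_percent: Kn = 100 * K / (1 + K)."""
    return 100.0 * k / (1.0 + k)

def laight_elo(ply, k, c1=0.1, c2=1.3, c3=13500.0):
    """Version 2 of the quoted formula; base-10 log is an assumption."""
    return math.log10(ply * k * c1 + c2) * c3

if __name__ == "__main__":
    for ply in (1, 10):
        print(f"Ply {ply} elo = {round(laight_elo(ply, 0.2))}")
    print(f"K = 0.2 means Kn = {percent_from_k(0.2):.1f}%")
```

With K = 0.2 this gives 1628 at ply 1 and 2377 at ply 10, and Kn comes out at about 16.7%, which rounds to the 17% quoted.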

DB has a lot more knowledge than Hiarcs does.  Also, DB was searching a lot more
than the 14 ply that Joshua mentioned - it searched 14 + 30(+) ply in most
interesting lines.  It's really difficult to determine an accurate 'depth' most
of the time, because for each program it will be different, based on the number
of extensions done.  Hiarcs probably extends more than most other micro
programs, but DB extends even more.

Unfortunately, this formula probably won't work for humans, because it's
impossible to determine a search depth.  Therefore we can't use humans as a
gauge for the accuracy of the 'K'.

I figured for myself that I have only about 5% of chess knowledge (K=0.05), and
that I search 2 ply.  Based on this, I got a rating of 1583.  This is a bit too
high, I think. :)
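
A quick sketch of why the low end comes out high, assuming the Version 2 constants quoted above (C1 = 0.1, C2 = 1.3, C3 = 13500) and a base-10 log. The helper name is hypothetical. Setting Ply * K to zero leaves only the C2 term, so the formula has a floor that no player can fall below.

```python
import math

# Constants from the quoted Version 2 of the formula.
C1, C2, C3 = 0.1, 1.3, 13500.0

def laight_elo(ply, k):
    """Version 2 of the quoted formula; base-10 log is an assumption."""
    return math.log10(ply * k * C1 + C2) * C3

mine = laight_elo(2, 0.05)    # 2 ply with K = 0.05
floor = laight_elo(0, 0.0)    # no search and no knowledge at all

print(f"2 ply, K=0.05: {round(mine)}")
print(f"floor (Ply*K = 0): {round(floor)}")
```

This gives 1583 for 2 ply at K = 0.05 and a floor of about 1538, so with these constants the formula cannot rate anyone much below 1540.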

Jeremiah



Last modified: Thu, 15 Apr 21 08:11:13 -0700
