Computer Chess Club Archives


Subject: Re: Book and EGTB self play test (long post)

Author: Robert Allgeuer

Date: 01:41:18 02/04/03


On February 03, 2003 at 17:46:19, Dann Corbit wrote:

[...]
>>
>>Platform and Tools:
>>===================
>>
>>Athlon Thunderbird 1.1 GHz
>>512 MB RAM
>>Windows 2000
>>
>>Crafty 19.1DC
>>BookThinker 4.1c
>>BookMaker 1.0b
>>Elostat 1.1b
>>PGN-Extract 15.0
>>Winboard 4.2.3
>>WB Tourney Manager 0.60 (Jori Ostrovskij)
>>PGN-Sammler
>>
>>
>>Results:
>>========
>>
>>    Program           Elo    +   -   Games   Score   Av.Op.  Draws
>>
>>  1 Crafty_lgbk_egtb: 2517   40  23   281    52.8 %   2497   53.0 %
>>  2 Crafty_aobt_egtb: 2510   40  23   287    51.7 %   2498   51.9 %
>>  3 Crafty_smbk_notb: 2507   42  24   270    51.1 %   2499   51.1 %
>>  4 Crafty_lgbk_notb: 2507   41  22   277    51.1 %   2499   58.1 %
>>  5 Crafty_aobk_egtb: 2490   22  40   283    48.4 %   2501   55.1 %
>>  6 Crafty_aobk_notb: 2486   24  41   271    47.6 %   2502   51.7 %
>>  7 Crafty_nobk_notb: 2482   23  40   275    47.1 %   2503   53.5 %
>>
>>From this the following can be concluded:
>>
>>The increase of playing strength due to a complete set of 5 piece EGTBs is
>>with 95% confidence equal or less than 67 ELO points.
>
>I don't see why you would bother with the number 67.  You may as well pick any
>number.  With NO EGTB, Crafty_smbk_notb and Crafty_lgbk_notb were within 10 ELO
>of the best finisher: Crafty_lgbk_egtb.  You are missing Crafty_smbk_egtb and so
>we cannot conjecture about the difference between having/not-having egtb for the
>small book.
>

What I do is calculate ceilings with 95% confidence.
For aobk: 4 + 41 + 22 = 67 (the rating difference plus the two error margins),
which is the upper ceiling of the difference between egtb and notb. For lgbk
the same calculation gives 72, so it might have been more exact to quote the
worst case here, i.e. 72, but in any case the ceiling for egtbs is, according
to these tests for this configuration, in the order of 70.
There is only a 5% probability that the difference is higher than that, and it
is likely that the difference is smaller.
The same calculation for the book, between nobk and lgbk, gives 107, also
a ceiling with 95% confidence.
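
To make the arithmetic explicit, here is a minimal sketch of the ceiling calculation in Python. The Elo values and error margins are copied from the results table; the dictionary layout and the function name `ceiling` are only illustrative and do not come from Elostat or any other tool used in the test. Adding the two margins to the rating difference is the method described in this post, not a general statistical recipe.

```python
# Elo and 95% error margins (plus, minus), copied from the results table.
RATINGS = {
    "lgbk_egtb": (2517, 40, 23),
    "lgbk_notb": (2507, 41, 22),
    "aobk_egtb": (2490, 22, 40),
    "aobk_notb": (2486, 24, 41),
}


def ceiling(stronger, weaker):
    """Upper bound on the Elo gap: rating difference plus the upper margin
    of the stronger entry plus the lower margin of the weaker entry."""
    elo_s, plus_s, _ = RATINGS[stronger]
    elo_w, _, minus_w = RATINGS[weaker]
    return (elo_s - elo_w) + plus_s + minus_w


print(ceiling("aobk_egtb", "aobk_notb"))  # 4 + 22 + 41 = 67
print(ceiling("lgbk_egtb", "lgbk_notb"))  # 10 + 40 + 22 = 72
```

The two printed values match the 67 and 72 quoted above for the egtb versus notb comparison.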


>> The results suggest
>>that the increase is even significantly smaller than that, if not almost
>>negligible.
>
>This would be my best guess.  However, the uncertainty in the data is enough
>that I think any sort of conclusion is questionable.
>

In my view one can make a statement about the upper ceilings with a certain
confidence, here 95%.

[...]

>
>Because you had learning enabled, I think the results are very hard to fathom.
>The reason I say that is that how a book is interpreted will change over time.
>Also, bad lines will be considered in the learning.  A version without any
>opening book may use this learning data and in essence has not only an opening
>book, but also a corrected one.
>
>It looks to me like we might say the following:
>
>It appears unlikely that EGTB files or computer generated opening books will add
>several hundred ELO to the crafty chess program when run under an autoplayer at
>20min+5sec on {unknown hardware configuration} with 96 MB hash and 6 MB pawn
>hash.
>
>It seems likely that an opening book is more valuable than a set of EGTB files
>for playing strength, but that is uncertain.
>

I think those are all valid conclusions. I did actually specify the HW
configuration in my original post: Athlon 1.1 GHz, Win 2K, 512 MB RAM.

I knew that turning on learning would be challenged (please note that only
book learning was enabled).
Without it there are too many duplicate games, so there is no way to run
such a test with that many games. With this still relatively low number of
games per copy, I would see book learning as an additional "random generator"
in the book selection process.
I am sorry, but I do not understand your point about the version without an
opening book using learning data.

The reason why smbk_egtb (and nobk_egtb) are missing is that they
produce too many duplicate games when playing against their counterparts
without egtbs. Furthermore, I had to make a trade-off between the number of
participants and the number of subrounds, and decided in favour of having more
subrounds rather than losing time with duplicate games. I thought and still
think, although it is an assumption, that the effect of egtbs on playing
strength is independent of the book. Also, in this test the ceilings of the
difference between egtb and notb are very close for aobk and lgbk (67 and 72).

It would be interesting to know how sensitive the results are to the
factors you mention (time control, hash size, computing power and chess
program), but that would be another set of long-running tests.

>Your conclusions (since they are ceilings) may be fairly safe.  But learning
>adds another monkey-wrench.

Robert




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.