Computer Chess Club Archives


Subject: Yace tablebase / bitbase test continued (long post)

Author: Robert Allgeuer

Date: 13:09:50 07/21/03



I have extended my recent bitbase and tablebase tests (posted at
http://f11.parsimony.net/forum16635/messages/49266.htm), with the goal of
examining how probing tablebases less aggressively affects playing strength.
For those who have not read the old posting: in short, the result of my previous
test was that at blitz time controls (300+2) there is no measurable increase in
playing strength due to 5 piece tablebases and/or bitbases. An earlier test
with Crafty 19.1 at rapid time controls (1200+5) gave the same result.



Method:
=======

As in the last test, Yace Paderborn was used. Compared to the previous test
three additional configurations of Yace were introduced that probe tablebases
less aggressively. The playing strength of the six versions was compared in
self-play and in gauntlet tournaments against other strong free winboard
engines: A round-robin tournament of the six Yace configurations with 100
subrounds was run, then each of the Yace configurations played a gauntlet
tournament with 20 subrounds against a series of other strong free chess
engines, some of them with and some of them without tablebases. Time control was
300+2. For further details on platform, method and tools used, please refer to
the posting referenced above.



Participants:
=============

1) Yace Paderborn: version with both bitbases and tablebases (3, 4 and 5 piece).
Large opening book from Yace website.

2) Yace Paderborn nobb: version with tablebases only, no bitbases configured.
Otherwise identical to above.

3) Yace Paderborn notbbb: version with neither tablebases nor bitbases.
Otherwise same configuration as above.

4) Yace Paderborn tbu0: version that probes tablebases only at the root-node
(tb_usage 0), but uses bitbases in the whole search-tree. Otherwise same
configuration as above.

5) Yace Paderborn tbu1: version that probes tablebases only until depth/2
(tb_usage 1), but uses bitbases in the whole search-tree. Otherwise same
configuration as above.

6) Yace Paderborn tbu1nobb: version that probes tablebases only until depth/2
(tb_usage 1), no bitbases configured. Otherwise same configuration as above.


Yace.ini file (example for tbu1 version):
hash 96M
egtb_cache 8M
tbldir f:\tb
bbpath .
tb_usage 1
pos_learn 0
book_learn 4
; to disable book learning, I suggest
; book_learn 4
; Yes - 4 and not 0, so that Yace will have access to NAGs in dblearn.bin
resign off



Results:
========

Resulting overall rating (5410 unique games, including also matches between the
opponents):

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Ruffian v1.0.1                 : 2627   25  38   417    69.4 %   2485   23.3 %
  2 Crafty v17.14DC                : 2571   28  29   418    61.8 %   2487   30.9 %
  3 Aristarch v4.21                : 2547   33  36   320    60.2 %   2475   22.2 %
  4 Yace Paderborn tbu1            : 2540   24  18   739    53.6 %   2515   38.2 %
  5 Yace Paderborn notbbb          : 2538   24  17   739    53.2 %   2516   40.1 %
  6 Yace Paderborn nobb            : 2529   25  17   737    51.8 %   2517   40.4 %
  7 Pepito v1.59 profile           : 2525   31  28   420    55.1 %   2489   25.0 %
  8 Yace Paderborn tbu1nobb        : 2521   25  17   739    50.5 %   2518   40.1 %
  9 Yace Paderborn tbu0            : 2520   25  17   739    50.3 %   2518   40.6 %
 10 Yace Paderborn                 : 2519   24  17   819    51.5 %   2509   35.8 %
 11 Little Goliath 2000 v3.9       : 2511   32  27   420    53.1 %   2490   25.7 %
 12 Aristarch v4.4                 : 2508   33  29   419    52.5 %   2490   20.0 %
 13 SoS 3                          : 2501   33  27   419    51.6 %   2491   23.4 %
 14 Green Light Chess v3.00        : 2494   34  26   420    50.5 %   2491   27.6 %
 15 Pharaon v2.62                  : 2478   27  33   420    48.1 %   2492   24.3 %
 16 Crafty v19.01DC                : 2450   28  31   418    44.0 %   2492   26.8 %
 17 LambChop v10.99                : 2437   30  30   419    42.0 %   2493   24.3 %
 18 Tao v5.4                       : 2425   33  29   420    40.2 %   2494   18.6 %
 19 Amy v0.8.3                     : 2412   38  33   319    39.8 %   2484   18.8 %
 20 Francesca M.0.0.9              : 2409   35  33   320    39.4 %   2484   26.2 %
 21 Comet B60                      : 2395   33  27   420    36.0 %   2495   24.3 %
 22 SlowChess v2.78                : 2338   48  29   319    29.6 %   2488   16.6 %


The six different Yace configurations scored within 21 ELO points of each other,
so the test shows no statistically measurable impact of tablebases and bitbases
on the playing strength of Yace, regardless of whether they are probed less or
more aggressively, or whether tablebases are combined with bitbases or not.
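
As a rough plausibility check of this statement, the error margins from the
table can be compared directly. The sketch below assumes that the "+" and "-"
columns are the upper and lower error margins of each rating, and it treats
overlapping intervals as "no measurable difference". That is only a heuristic,
not a formal significance test, and the names Rating and intervals_overlap are
made up for this illustration.

```python
from typing import NamedTuple, Tuple


class Rating(NamedTuple):
    name: str
    elo: int
    plus: int   # assumed: upper error margin ("+" column)
    minus: int  # assumed: lower error margin ("-" column)

    def interval(self) -> Tuple[int, int]:
        """Return (lowest, highest) rating covered by the error margins."""
        return (self.elo - self.minus, self.elo + self.plus)


def intervals_overlap(a: Rating, b: Rating) -> bool:
    """True if the two error intervals share at least one rating value."""
    a_lo, a_hi = a.interval()
    b_lo, b_hi = b.interval()
    return a_lo <= b_hi and b_lo <= a_hi


# Best and worst Yace configuration in the overall list above.
best = Rating("Yace Paderborn tbu1", 2540, 24, 18)
worst = Rating("Yace Paderborn", 2519, 24, 17)

print(best.elo - worst.elo)            # spread between the six configurations
print(best.interval(), worst.interval())
print(intervals_overlap(best, worst))  # overlap means no clear separation
```

Even the highest and lowest rated Yace configurations have overlapping
intervals under this assumption, which is consistent with the conclusion above.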


I also looked at whether there are differences in the stage at which a game is
won. For that purpose (resign was off where possible) I divided the games into
three classes: games that have reached positions with 5 or fewer pieces (i.e.
they reached positions contained in the 5 piece EGTBs), games that are 60 or
fewer moves long (i.e. games that probably have already been decided during the
middlegame), and finally those games that are longer than 60 moves but did not
reach positions with 5 or fewer pieces. The idea is that the games in the last
group were with high probability decided during the endgame, where probing
tablebases and bitbases during the search plays an important role.
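
The three classes can be sketched as a small function. This is an illustration
only: the post does not say how a short game that also reached 5 or fewer
pieces was counted, so the sketch assumes the 5 piece class takes precedence.
The names GameClass, classify_game, move_count and min_pieces are hypothetical,
and min_pieces is assumed to count all pieces on the board including kings.

```python
from enum import Enum


class GameClass(Enum):
    REACHED_5_PIECES = "reached a position with 5 or fewer pieces"
    SHORT = "60 or fewer moves"
    LONG_ENDGAME = "longer than 60 moves, never 5 or fewer pieces"


def classify_game(move_count: int, min_pieces: int) -> GameClass:
    """Assign a game to one of the three classes described above.

    move_count: length of the game in moves.
    min_pieces: smallest number of pieces (kings included) that was on the
                board at any point of the game.
    """
    # Assumption: reaching tablebase territory wins over game length.
    if min_pieces <= 5:
        return GameClass.REACHED_5_PIECES
    if move_count <= 60:
        return GameClass.SHORT
    return GameClass.LONG_ENDGAME


print(classify_game(45, 12).value)
print(classify_game(80, 4).value)
print(classify_game(80, 7).value)
```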


Rating taking into account only those games that at some stage reached positions
with 5 or fewer pieces:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Ruffian v1.0.1                 : 2663   59  62    78    73.7 %   2484   42.3 %
  2 Pepito v1.59 profile           : 2641   70  47    59    72.0 %   2476   55.9 %
  3 Comet B60                      : 2609   69  32    67    67.2 %   2484   65.7 %
  4 Pharaon v2.62                  : 2594   58  49    92    67.4 %   2468   47.8 %
  5 Aristarch v4.21                : 2558   77  54    59    63.6 %   2462   52.5 %
  6 Yace Paderborn                 : 2557   41  26   218    59.2 %   2492   56.0 %
  7 Yace Paderborn nobb            : 2540   46  24   198    54.8 %   2507   62.1 %
  8 Aristarch v4.4                 : 2538   80  47    66    56.8 %   2490   56.1 %
  9 Yace Paderborn tbu1            : 2528   47  25   198    53.8 %   2501   60.1 %
 10 Amy v0.8.3                     : 2523   90  51    55    55.5 %   2485   56.4 %
 11 Crafty v17.14DC                : 2506   67  35   101    53.5 %   2482   59.4 %
 12 SoS 3                          : 2501   80  44    78    50.6 %   2496   52.6 %
 13 Yace Paderborn tbu1nobb        : 2499   25  46   210    47.4 %   2517   57.6 %
 14 Yace Paderborn tbu0            : 2494   24  44   226    46.9 %   2515   58.4 %
 15 Little Goliath 2000 v3.9       : 2493   35  79    80    49.4 %   2498   66.2 %
 16 Green Light Chess v3.00        : 2464   32  65    95    43.2 %   2511   65.3 %
 17 Crafty v19.01DC                : 2454   44  76    74    43.9 %   2497   55.4 %
 18 Yace Paderborn notbbb          : 2442   26  39   222    38.7 %   2522   55.0 %
 19 Tao v5.4                       : 2402   59  61    90    36.1 %   2501   36.7 %
 20 LambChop v10.99                : 2368   57  60    85    32.9 %   2492   42.4 %
 21 SlowChess v2.78                : 2325   97  74    51    25.5 %   2512   31.4 %
 22 Francesca M.0.0.9              : 2288   76  50    94    21.3 %   2515   31.9 %


In this list the engines and Yace versions without tablebase support have
considerably weaker ratings than in the overall list, while engines with
tablebase support lead the pack. The less aggressively probing Yace versions
also have lower ratings than the more aggressively probing ones. For Yace the
difference between the version with full tablebase and bitbase support and the
version with neither is 115 ELO points, which is statistically significant.
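
To give a feeling for the size of such a gap, the sketch below converts a
rating difference into an expected score with the standard logistic Elo
formula. This is an assumption for illustration: the rating tool used for the
lists may use a different curve, and the two Yace versions did not play
identical sets of opponents within this subset. The function name
expected_score is made up.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) of the higher rated side, logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Ratings from the list of games that reached 5 or fewer pieces.
full_support = 2557   # Yace Paderborn (tablebases and bitbases)
no_support = 2442     # Yace Paderborn notbbb (neither)

diff = full_support - no_support
print(diff)                              # rating difference in ELO points
print(f"{expected_score(diff):.1%}")     # expected score at that difference
```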


Rating taking into account only games longer than 60 moves that did not reach
positions with 5 or fewer pieces:

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 Little Goliath 2000 v3.9       : 2632   40  62   167    70.1 %   2485   20.4 %
  2 Green Light Chess v3.00        : 2598   41  58   178    65.7 %   2485   16.9 %
  3 Ruffian v1.0.1                 : 2590   41  51   187    64.4 %   2486   23.0 %
  4 Crafty v17.14DC                : 2571   43  51   182    62.1 %   2486   20.9 %
  5 Yace Paderborn notbbb          : 2567   33  35   324    59.3 %   2502   24.7 %
  6 Aristarch v4.4                 : 2552   45  53   175    60.0 %   2482   16.0 %
  7 Pepito v1.59 profile           : 2545   43  42   214    56.8 %   2498   22.0 %
  8 Aristarch v4.21                : 2522   58  62   116    57.3 %   2471   16.4 %
  9 Pharaon v2.62                  : 2519   52  49   156    54.5 %   2488   19.2 %
 10 Yace Paderborn tbu0            : 2517   38  30   334    50.7 %   2511   23.7 %
 11 Yace Paderborn tbu1            : 2515   37  31   338    50.9 %   2509   21.3 %
 12 Yace Paderborn tbu1nobb        : 2513   38  30   339    50.3 %   2511   22.1 %
 13 Yace Paderborn nobb            : 2510   29  37   350    49.9 %   2511   23.7 %
 14 LambChop v10.99                : 2509   53  45   162    52.5 %   2492   23.5 %
 15 Yace Paderborn                 : 2473   31  32   392    44.5 %   2511   21.7 %
 16 Francesca M.0.0.9              : 2444   53  53   142    43.0 %   2493   21.1 %
 17 SlowChess v2.78                : 2441   56  56   128    43.4 %   2487   19.5 %
 18 SoS 3                          : 2433   49  41   208    39.2 %   2510   16.8 %
 19 Crafty v19.01DC                : 2410   46  37   239    37.4 %   2499   18.0 %
 20 Amy v0.8.3                     : 2389   63  41   180    33.6 %   2507   10.6 %
 21 Comet B60                      : 2369   56  37   203    31.5 %   2504   18.7 %
 22 Tao v5.4                       : 2365   61  37   204    31.1 %   2503   13.2 %

In this rating list the engines and Yace versions that do not probe tablebases
are ahead; even SlowChess and Francesca are ahead of e.g. SoS and Crafty 19.1.
For Yace, probing bitbases and tablebases results in a statistically significant
decrease in rating of 94 ELO points for such games.


Obviously (and logically) engines probing tablebases tend to play into won 5
piece positions (and avoid lost 5 piece positions), hence their high rating in
such games. However, in the preceding endgame phase, with more pieces still on
the board, tablebases must somehow be a liability, so that engines using them
lose a disproportionate number of games during this phase. Apparently,
statistically, for each game they successfully manoeuvre into a won 5 piece
position they also lose another one during the preceding endgame, so that
overall this effect more or less cancels out the advantages of tablebases. I
doubt that this is a tactical advantage, because even aggressively probing
engines that use only little CPU time at this stage still search deeper than
engines without tablebase support.

As a conclusion it appears that tablebases and bitbases are no real advantage
(which was also the result of previous experiments), although it is still not
clear - at least for me - why this is so. Any ideas?

Robert


