Author: Robert Allgeuer
Date: 13:09:50 07/21/03
I have extended my recent bitbase and tablebase tests (posted at
http://f11.parsimony.net/forum16635/messages/49266.htm), with the goal of looking
at what influence probing tablebases less aggressively has on playing strength.
For those who have not read the old posting: in short, the result of my previous
test was that at Blitz time controls (300+2) there is no measurable increase in
playing strength due to 5-piece tablebases and/or bitbases. Another previous
test with Crafty 19.1 at rapid time controls (1200+5) gave the same result.
Method:
=======
As in the last test, Yace Paderborn was used. Compared to the previous test,
three additional configurations of Yace were introduced that probe tablebases
less aggressively. The playing strength of the six versions was compared in
self-play and in gauntlet tournaments against other strong free winboard
engines: a round-robin tournament of the six Yace configurations with 100
subrounds was run, and then each of the Yace configurations played a gauntlet
tournament with 20 subrounds against a series of other strong free chess
engines, some of them with and some of them without tablebases. Time control
was 300+2; for further details on platform, method and the tools used, please
refer to the posting referenced above.
Participants:
=============
1) Yace Paderborn: version with both bitbases and tablebases (3, 4 and 5 piece).
Large opening book from Yace website.
2) Yace Paderborn nobb: version with tablebases only, no bitbases configured.
Otherwise identical to above.
3) Yace Paderborn notbbb: version with neither tablebases nor bitbases.
Otherwise same configuration as above.
4) Yace Paderborn tbu0: version that probes tablebases only at the root-node
(tb_usage 0), but uses bitbases in the whole search-tree. Otherwise same
configuration as above.
5) Yace Paderborn tbu1: version that probes tablebases only until depth/2
(tb_usage 1), but uses bitbases in the whole search-tree. Otherwise same
configuration as above.
6) Yace Paderborn tbu1nobb: version that probes tablebases only until depth/2
(tb_usage 1), no bitbases configured. Otherwise same configuration as above. (A
sketch of how these tb_usage levels gate probing follows the ini example below.)
Yace.ini file (example for tbu1 version):
hash 96M
egtb_cache 8M
tbldir f:\tb
bbpath .
tb_usage 1
pos_learn 0
book_learn 4
; to disable book learning, I suggest
; book_learn 4
; Yes - 4 and not 0, so that Yace will have access to NAGs in dblearn.bin
resign off
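To make the tb_usage levels a bit more concrete, here is a minimal Python sketch
of how probing could be gated by depth. This is purely illustrative and not
Yace's actual code: the function name and parameters are made up, the assumption
that the default setting probes in the whole tree is mine, and "until depth/2"
is read here as the half of the tree closest to the root.

    # Hypothetical sketch of depth-limited tablebase probing, loosely following
    # the tb_usage levels described above. Not Yace's actual implementation.

    def should_probe_tb(tb_usage, ply, remaining_depth, root_depth):
        """Decide whether to probe the tablebases at the current node."""
        if tb_usage == 0:
            # tb_usage 0: probe at the root node only
            return ply == 0
        if tb_usage == 1:
            # tb_usage 1: probe only in the half of the tree closest to the root
            return remaining_depth >= root_depth / 2
        # assumed default: probe in the whole search tree
        return True

    # Example: a node 3 plies below the root of a 12-ply search
    print(should_probe_tb(tb_usage=1, ply=3, remaining_depth=9, root_depth=12))  # True
    print(should_probe_tb(tb_usage=0, ply=3, remaining_depth=9, root_depth=12))  # False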
Results:
========
Resulting overall rating (5410 unique games, also including matches between the
opponents):
     Program                       Elo    +    -  Games  Score  Av.Op.  Draws
  1  Ruffian v1.0.1              : 2627   25   38    417  69.4 %   2485  23.3 %
  2  Crafty v17.14DC             : 2571   28   29    418  61.8 %   2487  30.9 %
  3  Aristarch v4.21             : 2547   33   36    320  60.2 %   2475  22.2 %
  4  Yace Paderborn tbu1         : 2540   24   18    739  53.6 %   2515  38.2 %
  5  Yace Paderborn notbbb       : 2538   24   17    739  53.2 %   2516  40.1 %
  6  Yace Paderborn nobb         : 2529   25   17    737  51.8 %   2517  40.4 %
  7  Pepito v1.59 profile        : 2525   31   28    420  55.1 %   2489  25.0 %
  8  Yace Paderborn tbu1nobb     : 2521   25   17    739  50.5 %   2518  40.1 %
  9  Yace Paderborn tbu0         : 2520   25   17    739  50.3 %   2518  40.6 %
 10  Yace Paderborn              : 2519   24   17    819  51.5 %   2509  35.8 %
 11  Little Goliath 2000 v3.9    : 2511   32   27    420  53.1 %   2490  25.7 %
 12  Aristarch v4.4              : 2508   33   29    419  52.5 %   2490  20.0 %
 13  SoS 3                       : 2501   33   27    419  51.6 %   2491  23.4 %
 14  Green Light Chess v3.00     : 2494   34   26    420  50.5 %   2491  27.6 %
 15  Pharaon v2.62               : 2478   27   33    420  48.1 %   2492  24.3 %
 16  Crafty v19.01DC             : 2450   28   31    418  44.0 %   2492  26.8 %
 17  LambChop v10.99             : 2437   30   30    419  42.0 %   2493  24.3 %
 18  Tao v5.4                    : 2425   33   29    420  40.2 %   2494  18.6 %
 19  Amy v0.8.3                  : 2412   38   33    319  39.8 %   2484  18.8 %
 20  Francesca M.0.0.9           : 2409   35   33    320  39.4 %   2484  26.2 %
 21  Comet B60                   : 2395   33   27    420  36.0 %   2495  24.3 %
 22  SlowChess v2.78             : 2338   48   29    319  29.6 %   2488  16.6 %
The six different Yace configurations scored within 21 Elo points of each other,
so the test shows no statistically measurable impact of tablebases or bitbases
on the playing strength of Yace, regardless of whether the tablebases are probed
more or less aggressively and whether they are combined with bitbases or not.
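To give a feeling for what "statistically measurable" means here, the following
rough Python sketch (my own back-of-the-envelope helper, not part of the test
tooling) estimates the error bar of an Elo rating after n games from the score
and the draw rate. With roughly 740 games, a 53 % score and a 40 % draw rate per
Yace configuration, the 95 % error bar comes out at about +/- 19 Elo, so a
spread of 21 Elo between the six configurations is indeed within the noise.

    import math

    def elo_error_bar(n_games, score, draw_rate, z=1.96):
        """Approximate 95 % error bar of an Elo estimate, in Elo points."""
        # variance of a single game result (win = 1, draw = 0.5, loss = 0)
        var_game = score * (1.0 - score) - draw_rate / 4.0
        se_score = math.sqrt(var_game / n_games)
        # convert the score error into Elo via the slope of the logistic curve
        slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
        return z * se_score * slope

    # roughly the situation of one Yace configuration in the overall list
    print(round(elo_error_bar(739, 0.53, 0.40)))   # about 19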
I also looked at whether there are differences with respect to the stage at
which a game is decided. For that purpose (resign was off where possible) I
divided the games into three classes: games that reached positions with 5 or
fewer pieces (i.e. positions contained in the 5-piece EGTBs), games that are 60
or fewer moves long (i.e. games that were probably already decided in the
middlegame), and finally games longer than 60 moves that did not reach positions
with 5 or fewer pieces. The idea is that this last group contains the games that
were with high probability decided during the endgame, where probing tablebases
and bitbases during the search plays an important role.
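The classification itself can be reproduced directly from the PGN files. Below
is a minimal sketch using the python-chess library; it is only an illustration
(the file name games.pgn is a placeholder), not the tool actually used for this
test.

    import chess.pgn

    def classify(game):
        """Assign a game to one of the three classes described above."""
        board = game.board()
        min_pieces = len(board.piece_map())          # piece count including kings
        plies = 0
        for move in game.mainline_moves():
            board.push(move)
            plies += 1
            min_pieces = min(min_pieces, len(board.piece_map()))
        full_moves = (plies + 1) // 2
        if min_pieces <= 5:
            return "reached a 5-piece position"      # covered by the 5-piece EGTBs
        if full_moves <= 60:
            return "60 moves or fewer"               # probably decided in the middlegame
        return "long, never below 6 pieces"          # probably decided in the endgame

    with open("games.pgn") as f:                     # placeholder file name
        while (game := chess.pgn.read_game(f)) is not None:
            print(game.headers.get("White"), "-", game.headers.get("Black"), classify(game))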
Rating taking into account only those games that at some stage reached positions
with 5 or fewer pieces:
     Program                       Elo    +    -  Games  Score  Av.Op.  Draws
  1  Ruffian v1.0.1              : 2663   59   62     78  73.7 %   2484  42.3 %
  2  Pepito v1.59 profile        : 2641   70   47     59  72.0 %   2476  55.9 %
  3  Comet B60                   : 2609   69   32     67  67.2 %   2484  65.7 %
  4  Pharaon v2.62               : 2594   58   49     92  67.4 %   2468  47.8 %
  5  Aristarch v4.21             : 2558   77   54     59  63.6 %   2462  52.5 %
  6  Yace Paderborn              : 2557   41   26    218  59.2 %   2492  56.0 %
  7  Yace Paderborn nobb         : 2540   46   24    198  54.8 %   2507  62.1 %
  8  Aristarch v4.4              : 2538   80   47     66  56.8 %   2490  56.1 %
  9  Yace Paderborn tbu1         : 2528   47   25    198  53.8 %   2501  60.1 %
 10  Amy v0.8.3                  : 2523   90   51     55  55.5 %   2485  56.4 %
 11  Crafty v17.14DC             : 2506   67   35    101  53.5 %   2482  59.4 %
 12  SoS 3                       : 2501   80   44     78  50.6 %   2496  52.6 %
 13  Yace Paderborn tbu1nobb     : 2499   25   46    210  47.4 %   2517  57.6 %
 14  Yace Paderborn tbu0         : 2494   24   44    226  46.9 %   2515  58.4 %
 15  Little Goliath 2000 v3.9    : 2493   35   79     80  49.4 %   2498  66.2 %
 16  Green Light Chess v3.00     : 2464   32   65     95  43.2 %   2511  65.3 %
 17  Crafty v19.01DC             : 2454   44   76     74  43.9 %   2497  55.4 %
 18  Yace Paderborn notbbb       : 2442   26   39    222  38.7 %   2522  55.0 %
 19  Tao v5.4                    : 2402   59   61     90  36.1 %   2501  36.7 %
 20  LambChop v10.99             : 2368   57   60     85  32.9 %   2492  42.4 %
 21  SlowChess v2.78             : 2325   97   74     51  25.5 %   2512  31.4 %
 22  Francesca M.0.0.9           : 2288   76   50     94  21.3 %   2515  31.9 %
In this list the engines and Yace versions without tablebase support have
considerably weaker ratings than their overall ratings, while engines with
tablebase support lead the pack. The less aggressively probing Yace versions
also have lower ratings than the more aggressively probing ones. For Yace the
difference between the version with no tablebase and bitbase support and the
version with full support is 115 Elo points and statistically significant.
Rating taking into account only games longer than 60 moves that did not reach
positions with 5 or fewer pieces:
     Program                       Elo    +    -  Games  Score  Av.Op.  Draws
  1  Little Goliath 2000 v3.9    : 2632   40   62    167  70.1 %   2485  20.4 %
  2  Green Light Chess v3.00     : 2598   41   58    178  65.7 %   2485  16.9 %
  3  Ruffian v1.0.1              : 2590   41   51    187  64.4 %   2486  23.0 %
  4  Crafty v17.14DC             : 2571   43   51    182  62.1 %   2486  20.9 %
  5  Yace Paderborn notbbb       : 2567   33   35    324  59.3 %   2502  24.7 %
  6  Aristarch v4.4              : 2552   45   53    175  60.0 %   2482  16.0 %
  7  Pepito v1.59 profile        : 2545   43   42    214  56.8 %   2498  22.0 %
  8  Aristarch v4.21             : 2522   58   62    116  57.3 %   2471  16.4 %
  9  Pharaon v2.62               : 2519   52   49    156  54.5 %   2488  19.2 %
 10  Yace Paderborn tbu0         : 2517   38   30    334  50.7 %   2511  23.7 %
 11  Yace Paderborn tbu1         : 2515   37   31    338  50.9 %   2509  21.3 %
 12  Yace Paderborn tbu1nobb     : 2513   38   30    339  50.3 %   2511  22.1 %
 13  Yace Paderborn nobb         : 2510   29   37    350  49.9 %   2511  23.7 %
 14  LambChop v10.99             : 2509   53   45    162  52.5 %   2492  23.5 %
 15  Yace Paderborn              : 2473   31   32    392  44.5 %   2511  21.7 %
 16  Francesca M.0.0.9           : 2444   53   53    142  43.0 %   2493  21.1 %
 17  SlowChess v2.78             : 2441   56   56    128  43.4 %   2487  19.5 %
 18  SoS 3                       : 2433   49   41    208  39.2 %   2510  16.8 %
 19  Crafty v19.01DC             : 2410   46   37    239  37.4 %   2499  18.0 %
 20  Amy v0.8.3                  : 2389   63   41    180  33.6 %   2507  10.6 %
 21  Comet B60                   : 2369   56   37    203  31.5 %   2504  18.7 %
 22  Tao v5.4                    : 2365   61   37    204  31.1 %   2503  13.2 %
This rating list shows those engines and Yace versions ahead that do not probe
tablebases; even SlowChess and Francesca are ahead of e.g. SoS and Crafty 19.1.
For Yace, probing bitbases and tablebases results in a statistically significant
decrease of its rating by 94 Elo points for such games.
Obviously (and logically) engines probing tablebases tend to play into won
5-piece positions (and to avoid lost ones), hence their high rating in such
games. However, in the preceding endgame phase, with more pieces still on the
board, tablebases must somehow be a liability, so that engines using them lose
a disproportionate number of games during this phase. Statistically, for each
game they successfully manoeuvre into a won 5-piece position, they apparently
also lose another one during the preceding endgame, so that overall this effect
more or less cancels out the advantages of tablebases. I doubt that the
explanation is tactical, because even aggressively probing engines spend only a
little CPU time on probing at this stage and still search deeper than engines
without tablebase support.
In conclusion, it appears that tablebases and bitbases are no real advantage
(which was also the result of previous experiments), although it is still not
clear - at least to me - why this is so. Any ideas?
Robert