Author: Christopher A. Morgan
Date: 19:47:10 12/26/02
389 Middle and Endgame Test Positions, 23 Engines - Results, Comparisons
INTRODUCTION:
Encyclopedia of Chess Middlegame (ECM) and Modern Endgame Studies (MES) Test
Suites.
Files available on request in PGN and ChessBase (.cbv) format (additional
documentation below):
ECM-GCP+ 194: 194 middlegame positions for testing tactical ability.
MES 1-195: 195 of the first 200 endgame positions, limited to pawn(s), pawn(s)
and knight(s), and pawn(s) and bishop(s) positions.
Test setup: Athlon 750, 384MB RAM, 96MB hash, 30 seconds maximum time to solve
each position, run in the Fritz 7 and Fritz 8 GUIs.
HEADER EXPLANATIONS FOR TABLE:
The only way I could properly format the table below for this post was to have
the name of the chess engine to the far right, rather than to the far left as is
usual. A larger table with four additional columns of data, more descriptive
table headers, and much better formatting is available on request both in Excel
and in Word, both formatted for one page letter size printing.
(A) - ECM positions test rank, first sorted by percentage solved (same as column
B), then by column (C) time (average time to solve the solved problems), #1 is
best.
(B) - Percent of the 194 ECM test positions solved within the 30 seconds
allowed.
(C) - Average time to solve the ECM solved problems, excludes time for unsolved
problems.
(D) - MES positions test rank, sorted the same as for the ECM tests in column
(A).
(E) - Percent of the 195 MES test positions solved within the 30 seconds
allowed.
(F) - Average time to solve the MES solved positions, excludes time for unsolved
problems.
(G) - Combined average rank for ECM and MES tests = (ECM rank + MES rank)/2.
(H) - Overall rank. Determined by sorting first by column (G), the combined
average of ECM and MES ranks, then by column (B), ECM percentage solved, and
finally by column (E), MES percentage solved. An engine that solves more ECM
problems therefore gets the higher overall rank, but only when its average ECM
and MES rank is the same as another engine’s, in effect giving a slight nod to
the engine with better tactical ability.
(I) - Chess engine name.
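For readers who prefer a script to a spreadsheet, the sorting described in (A) through (H) can be sketched in Python. This is only an illustration: the data rows are copied from the table below (percent solved and average time per suite), the names RESULTS, rank_by and overall_table are made up for this sketch, and it assumes the rounded figures shown in the table are enough to reproduce the sorts.

```python
# (engine, ECM % solved, ECM avg time, MES % solved, MES avg time), from the table
RESULTS = [
    ("Fritz 6", 80.9, 6.42, 94.4, 2.49),
    ("Fritz 8", 80.9, 4.96, 91.8, 1.37),
    ("Hiarcs 8", 83.5, 6.36, 89.7, 1.54),
    ("Deep Fritz", 79.4, 5.09, 93.8, 1.93),
    ("KnightDreamer 3.1", 85.1, 4.13, 88.2, 1.95),
    ("Fritz 7", 77.3, 4.88, 95.4, 2.18),
    ("Goliath Lt 1.5", 86.1, 6.34, 85.6, 3.48),
    ("Nimzo 8", 88.7, 3.98, 84.6, 1.36),
    ("Yace 0.99.56", 76.8, 6.06, 90.8, 2.53),
    ("Shredder 6.02", 76.8, 5.81, 90.3, 2.15),
    ("Little Goliath 2000 v3.8", 79.9, 6.51, 85.6, 3.33),
    ("Gambit Tiger 2", 79.4, 5.30, 87.2, 2.65),
    ("Ruffian 1.0.1", 78.4, 6.07, 87.7, 1.98),
    ("Shredder 7", 74.7, 7.05, 88.7, 2.06),
    ("List 504", 77.8, 6.64, 85.1, 2.31),
    ("Chess Tiger 14", 73.7, 4.48, 87.2, 2.66),
    ("Crafty 19.01", 57.7, 9.01, 89.7, 2.24),
    ("Junior 7", 70.6, 5.44, 83.6, 2.69),
    ("Pharaon 2.62", 69.1, 7.82, 83.6, 2.18),
    ("Sjeng 12.13", 72.2, 7.16, 80.0, 1.27),
    ("Comet B54", 58.8, 7.43, 83.1, 3.01),
    ("Movei 00_7902", 69.1, 5.99, 77.9, 2.53),
    ("LambChop 10.88", 58.2, 9.75, 82.1, 2.96),
]


def rank_by(rows, pct_index, time_index):
    """Columns (A)/(D): rank 1 = highest % solved, ties broken by lower avg time."""
    ordered = sorted(rows, key=lambda r: (-r[pct_index], r[time_index]))
    return {r[0]: i for i, r in enumerate(ordered, start=1)}


def overall_table(rows):
    """Column (H): sort by (G) ascending, then (B) descending, then (E) descending."""
    ecm_rank = rank_by(rows, 1, 2)
    mes_rank = rank_by(rows, 3, 4)
    table = []
    for name, ecm_pct, _, mes_pct, _ in rows:
        combined = (ecm_rank[name] + mes_rank[name]) / 2  # column (G)
        table.append((name, ecm_rank[name], mes_rank[name], combined, ecm_pct, mes_pct))
    table.sort(key=lambda t: (t[3], -t[4], -t[5]))
    return table


table = overall_table(RESULTS)
for overall, (name, a, d, g, _, _) in enumerate(table, start=1):
    print(f"{overall:2d}  {name:<26} ECM rank {a:2d}  MES rank {d:2d}  avg {g:4.1f}")
```

The tie-breaks only matter where two engines share the same value, for example the two engines at 9.5 in column (G) with the same ECM percentage, which are separated by MES percentage solved.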
TABLE: Below, column (H) is overall ranking, column (I) is chess engine name.
 ECM   ECM %   ECM Avg   MES   MES %   MES Avg      Avg  Overall
Rank  Solved  Time (s)  Rank  Solved  Time (s)     Rank     Rank  Engine
 (A)     (B)       (C)   (D)     (E)       (F)      (G)      (H)  (I)
   6   80.9%      6.42     2   94.4%      2.49      4.0        1  Fritz 6
   5   80.9%      4.96     4   91.8%      1.37      4.5        2  Fritz 8
   4   83.5%      6.36     7   89.7%      1.54      5.5        3  Hiarcs 8
   8   79.4%      5.09     3   93.8%      1.93      5.5        4  Deep Fritz
   3   85.1%      4.13    10   88.2%      1.95      6.5        5  KnightDreamer 3.1
  12   77.3%      4.88     1   95.4%      2.18      6.5        6  Fritz 7
   2   86.1%      6.34    15   85.6%      3.48      8.5        7  Goliath Lt 1.5
   1   88.7%      3.98    17   84.6%      1.36      9.0        8  Nimzo 8
  14   76.8%      6.06     5   90.8%      2.53      9.5        9  Yace 0.99.56
  13   76.8%      5.81     6   90.3%      2.15      9.5       10  Shredder 6.02
   7   79.9%      6.51    14   85.6%      3.33     10.5       11  Little Goliath 2000 v3.8
   9   79.4%      5.30    12   87.2%      2.65     10.5       12  Gambit Tiger 2
  10   78.4%      6.07    11   87.7%      1.98     10.5       13  Ruffian 1.0.1
  15   74.7%      7.05     9   88.7%      2.06     12.0       14  Shredder 7
  11   77.8%      6.64    16   85.1%      2.31     13.5       15  List 504
  16   73.7%      4.48    13   87.2%      2.66     14.5       16  Chess Tiger 14
  23   57.7%      9.01     8   89.7%      2.24     15.5       17  Crafty 19.01
  18   70.6%      5.44    19   83.6%      2.69     18.5       18  Junior 7
  20   69.1%      7.82    18   83.6%      2.18     19.0       19  Pharaon 2.62
  17   72.2%      7.16    22   80.0%      1.27     19.5       20  Sjeng 12.13
  21   58.8%      7.43    20   83.1%      3.01     20.5       21  Comet B54
  19   69.1%      5.99    23   77.9%      2.53     21.0       22  Movei 00_7902
  22   58.2%      9.75    21   82.1%      2.96     21.5       23  LambChop 10.88
ADDITIONAL DOCUMENTATION:
The ECM-GCP+ file is based on a recent post containing the 183 ECM-GCP positions
compiled by Gian-Carlo Pascutto here in CCC in September 2001 and tested by
members here. The 183 ECM-GCP positions are available for download as part of
his Sjeng 12.13 UCI/Winboard program at http://sjeng.org. The recent post
contained an additional 18 ECM positions, for a total of 201 positions that I
started with. I initially tested nine of the better engines on all 201
positions with the 30 seconds maximum time, and deleted the seven positions that
none of the nine engines solved within the 30 seconds before the final test runs
of all 23 engines. Thus, each of the 194 remaining positions was solved by at
least one of the 23 engines within the 30 seconds. All problems have the text
solution only; none have alternate solutions. Additional ECM (Encyclopedia of
Chess Middlegame) test suite position files of 700 positions and more, in PGN,
can be found at Maro’s Chess Test Suite Positions page at
http://www.geocities.com/CapeCanaveral/Launchpad/2640/pgn/tests/.
The MES 1-195 endgame positions file was taken from the first 200 positions of
the MES.zip (Modern Endgame Studies) file of over 1200 endgame positions at
Maro’s site (link given above). As with the ECM-GCP+ file, I first tested all
200 positions, this time on five of the better engines, and excluded the five
positions not solved by any of them, leaving 195 final positions for testing by
the 23 engines. Thus, each of the 195 positions was solved by at least one of
the 23 engines within the 30 seconds. All problems have the text solution only;
no alternate solution is given.
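The pre-test filtering used for both suites (drop every position that no engine solves within the time limit) can be sketched as follows. The engine names and times in this example are hypothetical, made up purely for illustration, and keep_solvable is not part of any chess tool; only the 30 second limit comes from the tests described here.

```python
TIME_LIMIT = 30.0  # seconds allowed per position, as in the tests described here


def keep_solvable(times_by_engine, limit=TIME_LIMIT):
    """Return the indices of positions solved by at least one engine in time.

    times_by_engine maps an engine name to a list with one entry per position:
    the solve time in seconds, or None if the engine found no solution.
    """
    n_positions = len(next(iter(times_by_engine.values())))
    kept = []
    for i in range(n_positions):
        solved = any(
            times[i] is not None and times[i] <= limit
            for times in times_by_engine.values()
        )
        if solved:
            kept.append(i)
    return kept


# Hypothetical pre-test results: three made-up engines on five positions.
pretest = {
    "Engine A": [1.2, None, 29.5, None, 4.0],
    "Engine B": [0.8, None, None, 31.0, 2.5],
    "Engine C": [None, None, 12.0, None, None],
}
kept = keep_solvable(pretest)
print(kept)  # [0, 2, 4]: positions 1 and 3 were not solved in time by any engine
```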
Version 7.0.0.7 of the Fritz 7 engine and version 8.0.0.8 of the Fritz 8 engine
were used.
The Shredder 7 engine has no version number.
If you wish to run endgame position problems in the Fritz 7/8 GUI, you have to
make sure your engine does not access the EGTBs (End Game Tablebases). This is
made unnecessarily complicated by ChessBase (CB). If you watch the analysis
window with additional analysis information checked, then in an endgame position
with few enough pieces for your EGTBs to be accessed you’ll normally see “tb”
and a number as the EGTBs are accessed. You’ll also hear the hard drive as it is
being used. To make sure my EGTBs were not being accessed I had to take the
following steps:
1. In the Load Engine window, make sure the use tablebases box is not checked.
2. In Tools –> Options –> Tablebases, make sure the paths are blank, but
remember to put the paths back in when you want to use the tablebases again.
3. If you have tablebases in the default directory,
c:\My Documents\ChessBase\Tablebases, change the name of the Tablebases
directory. Deleting the path in step 2 is not enough.
You’d think that just leaving the use tablebases box unchecked in the Load
Engine dialogue would be enough, but it is not.
FILES AVAILABLE ON REQUEST:
I have eight files, any or all of which are available on request. There are two
ChessBase (.cbv format) and two PGN files for the two position test suites.
There are three Excel files: one of the table here, expanded with four
additional columns of data and much better formatting (prints on one letter size
page), and two additional files, each of which shows, for each position, each of
the 23 engines’ time to solve, or that the engine didn’t solve it in the time
given. Finally, there is one Word file of the expanded table for those who don’t
have Excel (prints on one standard letter size page). The additional Excel files
are generated by ChessBase in Tools –> Analysis –> Process Test Sets –> (after
the test file is selected) Previous Results –> Clip Results. You can paste
directly into Excel, as the output is formatted correctly for Excel, although
you have to do some column width formatting, etc. in Excel.
COMMENTS:
Although a chess program’s high problem solving ability does not directly
translate into overall engine strength in playing a game, I did find it
interesting that Fritz 8, probably the strongest engine available now, is number
two on the list. It solved the endgame positions very quickly, at an average of
1.37 seconds. I expected Shredder 7 to be much higher on the list than number 14
overall, way back in the pack.
The absolute times are, of course, not that important. Faster processors will
have much quicker times. The value of the table is in seeing which engines are
better/faster problem solvers, either in the end game, or, tactically, in the
middle game, or on a combined basis when we give both end and middle game
approximately equal weighting. I certainly think you could make a good argument
that middlegame problem solving should be given greater weight than endgame
problem solving, but there are lots of endgame positions of more than five
pieces in the test suite, and for these, tablebases won’t help the engine. In
the available Excel file you can give end and middle game positions a different
weight from the approximately equal weight I gave, and re-sort to see what the
difference in the overall ranking would be.
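As a rough illustration of such a re-weighting, the sketch below applies a weight to the two ranks of five engines from the table, using columns (A) and (D) as computed over all 23 engines. It assumes the weight is applied to the ranks themselves, with the ECM weight and MES weight summing to 1, and that ties go to the better ECM rank. The Excel file may do this differently, and weighted_order is a made-up name.

```python
# (engine, ECM rank, MES rank): columns (A) and (D) for five engines in the table
RANKS = [
    ("Fritz 6", 6, 2),
    ("Fritz 8", 5, 4),
    ("Hiarcs 8", 4, 7),
    ("Fritz 7", 12, 1),
    ("Nimzo 8", 1, 17),
]


def weighted_order(ranks, ecm_weight=0.5):
    """Order engines by a weighted average of ECM and MES rank (lower is better).

    ecm_weight=0.5 gives the equal weighting of column (G); a larger value
    favours middlegame tactics. Ties are broken by the better ECM rank.
    """
    mes_weight = 1.0 - ecm_weight
    scored = [
        (ecm_weight * ecm + mes_weight * mes, ecm, name)
        for name, ecm, mes in ranks
    ]
    scored.sort()
    return [name for _, _, name in scored]


print(weighted_order(RANKS))        # equal weighting
print(weighted_order(RANKS, 0.7))   # 70% weight on the ECM rank
```

With equal weights, these five engines keep the same relative order as in column (H). With 70% of the weight on the ECM rank, Fritz 8 moves ahead of Fritz 6, and Nimzo 8 moves ahead of Fritz 7.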
Some of the engines I tested do not use tablebases so the only way to equalize
these engines with those that use tablebases is to not use them in engine
matches, engine tournaments and endgame test positions.
I had no reason for selecting the types of endgame positions, given in the
introduction, above, as I just chose the first 200 positions and those were the
types of endgames in those problems.
Season’s Greetings to all!
Chris