Author: Robert Allgeuer
Date: 11:40:22 08/05/04
I have run a test with Fruit 1.5 to determine which of its parameters have a
positive and which a negative impact on Fruit's playing strength.
Method:
=======
The test consisted of a round-robin tournament of several configurations of
Fruit 1.5 and a set of reference engines. I chose this approach rather than a
pure self-play test of the different Fruit configurations, because the results
of a self-play test may not be representative of the playing strength against
other opponents.
The Nunn 1 starting positions were used; for each pairing each engine had to
play both sides, resulting in 20 games for each pairing and 3800 games overall.
The tournament results have been analysed with Elostat and a corresponding
rating table has been calculated.
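The game counts above can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and it assumes the Nunn 1 set contains 10 positions (which is what 20 games per pairing with both colours implies) and that 20 engines took part (7 Fruit configurations plus 13 others).

```python
from math import comb

def round_robin_size(n_engines, n_positions):
    """Game counts for a round robin where every pairing plays
    each starting position once with each colour."""
    games_per_pairing = 2 * n_positions            # both sides of every position
    pairings = comb(n_engines, 2)                  # unordered pairs of engines
    total_games = pairings * games_per_pairing
    games_per_engine = (n_engines - 1) * games_per_pairing
    return games_per_pairing, pairings, total_games, games_per_engine

# 20 engines, 10 starting positions (assumed, see above)
print(round_robin_size(20, 10))
```

The last value is the number of games each engine plays, which should match the Games column of the rating table.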
Platform, Tools and Settings:
=============================
Athlon XP 2400+
1.1 GB RAM
Windows XP
Elostat 1.1b
Arena 1.08
Time Control: 5min + 2sec
Ponder off
EGTBs enabled when supported
64MB Hash
Participants:
=============
Seven different configurations of Fruit 1.5: the default settings and six
settings that each modify exactly one UCI parameter:
Fruit v1.5def: Fruit 1.5 with the default parameter setting
Fruit v1.5nmalways: nullmove search is tried always (instead of in the fail-high
case only)
Fruit v1.5noetc: ETC disabled
Fruit v1.5ppushext: pawn push extension (7th rank) enabled
Fruit v1.5nosinglerep: single reply extension disabled
Fruit v1.5noqchecks: quiescence search does not include checking moves
Fruit v1.5nmR2: nullmove reduction set to 2 instead of the default 3
plus 13 other engines.
Results:
========
   Program                  Elo    +    -  Games   Score  Av.Op.   Draws
 1 Ruffian v1.01         : 2695   26   42    380  70.7 %    2543  21.3 %
 2 List v5.12            : 2664   28   37    380  66.6 %    2544  23.7 %
 3 El Chinito v3.25      : 2643   29   35    380  63.7 %    2545  23.7 %
 4 Gothmog v0.4.8        : 2604   31   33    380  58.2 %    2547  19.5 %
 5 Fruit v1.5nmalways    : 2596   32   31    380  57.0 %    2548  23.9 %
 6 Fruit v1.5noetc       : 2572   34   30    380  53.3 %    2549  20.8 %
 7 Fruit v1.5ppushext    : 2571   34   29    380  53.2 %    2549  24.2 %
 8 Fruit v1.5def         : 2568   34   29    380  52.8 %    2549  22.9 %
 9 Fruit v1.5nosinglerep : 2560   35   27    380  51.4 %    2550  28.2 %
10 Fruit v1.5noqchecks   : 2554   35   30    380  50.5 %    2550  19.5 %
11 Ktulu v5.0            : 2554   35   27    380  50.5 %    2550  29.5 %
12 AnMon v5.21           : 2552   36   28    380  50.3 %    2550  24.7 %
13 SoS4                  : 2547   28   35    380  49.5 %    2550  24.2 %
14 Amyan v1.592          : 2537   29   35    380  48.0 %    2551  22.4 %
15 Fruit v1.5nmR2        : 2534   29   34    380  47.5 %    2551  24.5 %
16 Yace Paderborn        : 2509   33   32    380  43.8 %    2552  18.2 %
17 Ufim v5.00            : 2460   35   29    380  36.7 %    2555  22.4 %
18 Frenzee v1.59         : 2439   39   28    380  33.8 %    2556  18.7 %
19 Patzer v3.61          : 2424   40   27    380  31.7 %    2557  19.7 %
20 Sjeng v12.13          : 2417   42   27    380  30.9 %    2557  19.2 %
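To see how the Elo, Score and Av.Op. columns relate, here is a small sketch using the standard logistic Elo formula. The post does not describe how Elostat computes its ratings, so this is an assumption and not necessarily Elostat's exact method; the function names are hypothetical.

```python
from math import log10

def elo_diff_from_score(score):
    """Elo difference implied by a score fraction in (0, 1),
    using the standard logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * log10(score / (1.0 - score))

def performance(avg_opponent, score_percent):
    """Performance rating: average opponent rating plus the implied difference."""
    return avg_opponent + elo_diff_from_score(score_percent / 100.0)

# (name, Score in %, Av.Op.) copied from the table
rows = [
    ("Ruffian v1.01", 70.7, 2543),
    ("Fruit v1.5def", 52.8, 2549),
    ("Sjeng v12.13", 30.9, 2557),
]
for name, score, av_op in rows:
    print(f"{name:15s} {performance(av_op, score):7.1f}")
```

For these three rows the computed values land within a point or two of the Elo column, which suggests the table is at least consistent with this model.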
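Not surprisingly, the differences in playing strength between the parameter
settings are not statistically significant, even after 3800 games: each
configuration played only 380 of them, and the error margins of roughly 30 Elo
are as large as most of the differences between the Fruit configurations.
Nevertheless, I would venture the following interpretation: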
Parameter settings that probably increase Fruit's playing strength:
- Always trying nullmoves; it seems that the fail-high condition is a bit too
aggressive and skips nullmove searches that in fact would have failed high
Parameter settings that probably are performance neutral:
- Disabling ETC (although I reckon that at longer time controls and deeper
search depths ETC should give a better return and could yield an increase in
playing strength)
- Enabling pawn push extensions
- Disabling single reply extensions
Parameter settings that probably decrease playing strength slightly:
- Disabling checks in quiescence search
Parameter settings that probably decrease playing strength:
- Reducing the nullmove reduction to 2
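The remark about statistical significance can be illustrated by comparing each configuration's rating interval with that of the default setting. This is a rough sketch only: it assumes the "+" and "-" columns are the upper and lower error margins reported by Elostat, and overlapping intervals are a heuristic, not a formal significance test.

```python
# (Elo, +, -) copied from the results table for the seven Fruit configurations
ratings = {
    "def":         (2568, 34, 29),
    "nmalways":    (2596, 32, 31),
    "noetc":       (2572, 34, 30),
    "ppushext":    (2571, 34, 29),
    "nosinglerep": (2560, 35, 27),
    "noqchecks":   (2554, 35, 30),
    "nmR2":        (2534, 29, 34),
}

def interval(entry):
    """Return (low, high) bounds of the rating, given (elo, plus, minus)."""
    elo, plus, minus = entry
    return elo - minus, elo + plus

def overlaps(a, b):
    """True if the rating intervals of the two entries intersect."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a

default = ratings["def"]
for name, entry in ratings.items():
    if name == "def":
        continue
    diff = entry[0] - default[0]
    print(f"{name:12s} {diff:+4d} Elo vs. default, "
          f"intervals overlap: {overlaps(entry, default)}")
```

With the numbers from the table, every configuration's interval overlaps that of the default setting, including nmalways and nmR2 at the two extremes.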
Conclusion:
===========
Generally, the impact of the different parameter settings on Fruit's playing
strength is comparatively small.
I personally am a bit surprised that enabling or disabling the extensions makes
pretty much no difference, and would be interested in views as to why this would
be the case.
I also would have expected that not searching checking moves in the quiescence
search has a bigger (negative) impact than measured here.
I am currently extending this test by testing two further parameter settings:
- checks in quiescence search only after a nullmove
- the alternative material piece values as proposed by J. Rang
Eventually I plan to also test the combination of the best parameters in order
to see whether improvements add up or not.
Robert (A.)