Computer Chess Club Archives


Subject: Re: Puzzled about testsuites

Author: Ed Schröder

Date: 14:15:01 03/09/04


On March 09, 2004 at 11:28:16, Michel Langeveld wrote:

Michel,

Use test suites only if you are testing search-related stuff. Whatever change
you make (tactics, search, positional), it should be tested by playing games.
For lack of an in-house GM, play comp-comp games: 200 at minimum, better
400-600. I do 400 myself.

My best,

Ed
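
To give a feel for why a few hundred games are suggested, here is a rough sketch of the statistical error margin on a match result as a function of the number of games. It is not from the post: it assumes the usual logistic Elo model, a normal approximation, and a draw ratio of 20% (a guess in the range of the draw percentages quoted below). The names elo_diff and elo_margin are made up for this illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, draw_ratio=0.2, score=0.5, z=1.96):
    """Approximate upper 95% error margin, in Elo, for a match of `games` games.

    Assumptions (not from the post): games are independent, the draw ratio
    is 20%, and the normal approximation to the mean score is good enough.
    """
    wins = score - draw_ratio / 2.0
    losses = 1.0 - wins - draw_ratio
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    std_err = math.sqrt(variance / games)
    # Clamp so the logistic formula stays defined for very small samples.
    upper = min(score + z * std_err, 0.999)
    return elo_diff(upper) - elo_diff(score)


if __name__ == "__main__":
    for n in (50, 200, 400, 600):
        print(f"{n:4d} games: about +/- {elo_margin(n):.0f} Elo")
```

Under these assumptions the margin shrinks only with the square root of the number of games, so small improvements need long matches before they show up reliably.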



>Last week I worked hard on Nullmover and the ecm-gcp testsuite in particular.
>I thought this testsuite was a fast way to tune middlegame knowledge: kick
>things out, put things back in, and do multiple test runs.
>
>I did the following:
>*1 kicked out my king safety entirely and used only pawn shelter
>*2 kicked out my mobility entirely
>*3 added more information in the pawn struct
>
>*1 gives me +- 10% performance back
>*2 gives me +- 10% performance back
>*3 gives me +- 5% extra performance.
>
>I know that nps is not important, but on a P3 800MHz I get around
>450-500 knps most of the time. On a PIV 2.4GHz machine I get
>1000-1100 knps most of the time.
>
>I tested:
>- testsuite: ecm-gcp.epd
>- timecontrol: 5 seconds
>- hardware: P3 800Mhz machine
>- hashtable: 50Mb
>
>P3 800: The old nullmover (v0.25) scored 58 positions right in 5 seconds.
>P3 800: The new nullmover (v0.25c) scores 75!! positions right in 5 seconds.
>
>I also tested at 10 seconds a move, and there it is also an improvement:
>
>P3 800: The old Nullmover (v0.25) scores 78 positions right in 10 seconds.
>P3 800: The new Nullmover (v0.25c) scores 95!! positions right in 10 seconds.
>
>I also made another testing mechanism that emulates a game. It skips the
>opening, and I use it to test the hashing algorithm:
>
>HASHTEST 10
>Test01:  3080962 time= 6.31 FH= 96.6% hhit%:  6.1% a4xb6=27
>Test02:  4408305 time= 8.92 FH= 94.4% hhit%:  7.1% f4xe5=23
>Test03: 11364980 time=21.20 FH= 92.5% hhit%:  5.8% o-o=-4
>Test04:  5378200 time= 9.73 FH= 93.4% hhit%:  6.3% h2-h3=25
>Test05:  2149098 time= 4.02 FH= 95.3% hhit%:  7.2% d5-b3=26
>Test06:  9823962 time=17.91 FH= 88.9% hhit%:  5.8% h2-h3=26
>Test07:  2493285 time= 4.39 FH= 94.0% hhit%:  6.7% c1-g5=24
>Test08:  2344532 time= 4.05 FH= 96.4% hhit%:  6.4% g5xf6=49
>Test09:  5186665 time= 9.06 FH= 90.6% hhit%:  6.8% h2-h3=50
>Test10:  2473415 time= 4.36 FH= 92.0% hhit%:  8.6% h2-h3=50
>Test11:  1252083 time= 2.50 FH= 88.5% hhit%:  8.2% e2xf3=52
>Test12:  4202884 time= 7.72 FH= 87.8% hhit%:  8.2% f3-h5=58
>Test13:  4188662 time= 8.89 FH= 84.5% hhit%:  8.7% f3-h5=62
>Test14:  2227369 time= 4.19 FH= 86.3% hhit%:  8.7% f2-c2=52
>Test15:  3483631 time= 6.33 FH= 87.6% hhit%: 10.8% a2-a4=38
>Test16:  2648911 time= 4.84 FH= 87.5% hhit%: 10.3% a2-a4=21
>Test17:  2582746 time= 4.61 FH= 87.4% hhit%: 10.7% a2-a4=25
>Test18:  1975938 time= 3.58 FH= 90.0% hhit%: 11.4% f2-e3=24
>Test19:  2570348 time= 4.53 FH= 89.5% hhit%: 12.2% a2-a4=13
>Test20:  2786429 time= 5.20 FH= 91.6% hhit%: 14.7% g3-e3=7
>Test21:  1312456 time= 2.61 FH= 88.8% hhit%: 14.7% f2-h4=-1
>Test22:  2294598 time= 4.26 FH= 87.2% hhit%: 13.5% a2-a4=4
>Test23:  3625229 time= 6.95 FH= 87.3% hhit%:  9.7% e1-e2=5
>Test24:  1322411 time= 2.72 FH= 88.5% hhit%: 10.5% e1-e3=-1
>Test25:   543736 time= 1.33 FH= 93.3% hhit%:  9.0% e3-f3=-5
>Test26:  2453827 time= 4.95 FH= 89.9% hhit%:  8.2% g1-h1=5
>Test27:  2146519 time= 4.33 FH= 91.8% hhit%:  8.4% d5-a2=5
>Test28:  1233153 time= 2.56 FH= 91.3% hhit%: 10.2% h2-f2=24
>Test29:   560284 time= 1.25 FH= 89.7% hhit%: 13.1% g1xf2=22
>Test30:  1673102 time= 3.27 FH= 85.9% hhit%: 11.1% d1-b1=37
>Test31:   651264 time= 1.45 FH= 83.0% hhit%: 14.0% a2-d5=48
>Test32:   491891 time= 1.16 FH= 84.7% hhit%: 13.3% g1-g2=51
>Test33:  1812927 time= 3.36 FH= 87.0% hhit%: 11.8% g1-f2=53
>Test34:   535576 time= 1.17 FH= 87.8% hhit%: 10.9% c3xd4=44
>Test35:   783155 time= 1.83 FH= 91.2% hhit%: 15.0% e1-c1=70
>Test36:   610921 time= 1.41 FH= 95.5% hhit%: 10.8% h3-h2=75
>Test37:   618524 time= 1.38 FH= 92.3% hhit%: 10.6% h2-c2=80
>Test38:   699858 time= 1.55 FH= 94.1% hhit%: 13.1% c1-d1=92
>Test39:   488832 time= 1.11 FH= 96.7% hhit%:  9.5% c2xc3=324
>Test40:   118758 time= 0.42 FH= 97.0% hhit%: 12.2% c1xc3=408
>Test41:   180520 time= 0.56 FH= 94.9% hhit%: 14.7% c3-c7=409
>Test42:   189391 time= 0.58 FH= 96.0% hhit%: 12.4% c7xf7=535
>Test43:   102407 time= 0.41 FH= 93.1% hhit%: 13.1% f7xf6=591
>Test44:   101539 time= 0.44 FH= 90.6% hhit%: 15.6% f6xd6=611
>TOTAAL: 101173283 time=193.39 nps=523157
>
>I expected a big elo improvement with this new version when playing comp vs comp
>games.
>
>The opposite happened. :-( The new version scores 122 Elo!! worse. I tested it
>on the same hardware (P3 800MHz) and at the same time control (40 moves in
>5 minutes) as the older version.
>
>26 Nullmover 0.25 : 2019   58  50   148    50.3 %   2017   14.2 %
>65 Nullmover 0.25c: 1897  115  85    44    30.7 %   2039   20.5 %
>
>I should play more games to get a more reliable Elo rating, but this is not
>what I expected. And it already looks so much worse that further testing
>doesn't make sense.
>
>I am very puzzled. I must have made a mistake somewhere. Maybe my approach of
>tuning on ecmgcp.epd was wrong.
>
>My questions:
>* Is 5 seconds a move a good time control for testing ecmgcp.epd?
>* Does it make sense to use a really long time control?
>* I learnt that ecmgcp.epd is much more tactical than strategic. Does someone
>have other/better positional/eval-tweaking advice? No testsuites? Another
>testsuite, like lct2.epd or the complete ecm.epd?
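
As a sanity check on the two rating lines quoted above, the sketch below recomputes each performance rating from the score percentage and the average opponent rating, then checks whether the two error ranges overlap. The column meanings (rating, plus, minus, games, score, average opponent) are an assumption based on how the numbers fit together; the post does not label them. The function names are made up for this illustration.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


# (name, rating, plus, minus, games, score, avg_opponent), copied from the
# quoted table; the column interpretation is an assumption.
ROWS = [
    ("Nullmover 0.25", 2019, 58, 50, 148, 0.503, 2017),
    ("Nullmover 0.25c", 1897, 115, 85, 44, 0.307, 2039),
]


def performance(score, avg_opponent):
    """Performance rating: average opponent rating plus the implied Elo gap."""
    return avg_opponent + elo_diff(score)


def interval(rating, plus, minus):
    """Rating range implied by the plus/minus columns."""
    return (rating - minus, rating + plus)


def intervals_overlap(a, b):
    """True if two closed ranges (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    ranges = []
    for name, rating, plus, minus, games, score, opp in ROWS:
        ranges.append(interval(rating, plus, minus))
        print(f"{name}: listed {rating}, recomputed "
              f"{performance(score, opp):.0f}, range {ranges[-1]}, "
              f"{games} games")
    print("ranges overlap:", intervals_overlap(ranges[0], ranges[1]))
```

With the quoted numbers the ranges are 1969 to 2077 for 0.25 and 1812 to 2012 for 0.25c, so they overlap: 44 games suggest the new version is weaker but cannot pin down the size of the gap.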


