Computer Chess Club Archives


Subject: Re: Evaluation Autotuning

Author: Vincent Diepeveen

Date: 09:33:24 06/29/04


On June 28, 2004 at 18:12:03, Robert Hyatt wrote:

>On June 28, 2004 at 16:42:33, Vincent Diepeveen wrote:
>
>>On June 28, 2004 at 08:54:00, Anthony Cozzie wrote:
>>
>>Anthony, an important question.
>>
>>My question is on which grounds you decide to pick a parameter P out of the
>>total number of tunable parameters in crafty's evaluation to tune.
>>
>>By the way, how can you draw conclusions based upon 10 games?
>
>Where did he say "10 games"?  He asked _many_ people to play 10 game matches
>using first the old values, then the new values.  Add up _many_ people and you
>get many*10 games...

In short, conclusions about changing a single parameter are based on 10-game
matches, and the assumption is that each parameter can be tuned
independently.
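
To put a rough number on what a 10-game match can resolve, here is a minimal sketch. It assumes the two versions are equally strong, the games are independent, and a normal approximation is acceptable. The function names and the draw_ratio parameter are illustrative only and not taken from crafty or any test tool.

```python
import math

def score_margin(games, draw_ratio=0.0, z=1.96):
    """Half-width of the ~95% interval on the mean score per game.

    Assumes equal engines: win and loss each have probability
    (1 - draw_ratio) / 2, so the per-game variance is 0.25 * (1 - draw_ratio).
    """
    sd = 0.5 * math.sqrt(1.0 - draw_ratio)
    return z * sd / math.sqrt(games)

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def games_needed(margin, draw_ratio=0.0, z=1.96):
    """Games required to shrink the score margin to the given half-width."""
    sd = 0.5 * math.sqrt(1.0 - draw_ratio)
    return math.ceil((z * sd / margin) ** 2)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        m = score_margin(n)
        print(n, "games: score 0.5 +/- %.3f, about +/- %.0f Elo"
              % (m, elo_from_score(0.5 + m)))
```

Under these assumptions the interval for 10 games spans hundreds of Elo, which is why only a gross blunder in a parameter would stand out in a single short match.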

>
>>
>>Of course, if you make some random change which is horrible, it will be figured
>>out pretty soon. For example, a doubled pawn gets a 5.0 pawn bonus instead of a
>>0.1 pawn penalty. Ten games will perhaps detect such a case.
>>
>>But if you pick 0.05 for a doubled pawn or 0.5 for a doubled pawn, your test of
>>10 games won't show with any real statistical confidence which of the two is
>>better.
>>
>>>Most of you probably know by now that my "secret project" with Bob has been
>>>evaluation autotuning.  I feel that with the improvements in parallel search and
>>>CPU speed, the evaluation is beginning to become more important than the
>>>search in top programs  (e.g. Shredder).  As the number of terms increases, the
>>>number of relations increases quadratically and the problem becomes harder.  I
>>>don't think that autotuning will ever be as good as hand tuning, but it is a
>>>good way to get a first guess for the value of a parameter, as well as seeing
>>>whether a new parameter is worth anything.  This weekend, aside from watching
>>>massive amounts of "Hajime no Ippo", I have finished a run with the latest
>>>tweaks to the algorithm.
>>>
>>>The basic idea (which actually dates to my senior year in college, when I took
>>>CAD tools, 18760) is to use simulated annealing. Simulated annealing makes
>>>random changes and accepts them probabilistically:
>>>
>>>(V1 > V0) : accept change
>>>(V1 < V0) : accept if rand() < exp((V1 - V0)/T)
>>>
>>>Where T is the temperature, and it gradually gets smaller during the anneal.
>>>This has the effect of ignoring small changes initially, and concentrating on
>>>the big picture. This is the best link I can find in a few minutes of google
>>>search; I will explain more if people are still unclear:
>>>
>>>http://members.aol.com/btluke/simann1.htm
>>>
>>>I am looking for some testers to run games with the latest crafty against
>>>various engines. Basically, run N games against challenger X with the standard
>>>settings, and then N games with the new settings.  I am only really interested
>>>in longer time controls: 20 min + on an Athlon 2.0G or so (70 min on P-650, etc),
>>>and at least 10 games.  If this is ever published, I will include everyone who
>>>runs some test games in the acknowledgements section.
>>>
>>>anthony
>>>
>>>The crafty evaluation script:
>>>
>>>evaluation   1     100
>>>evaluation   2     300
>>>evaluation   3     300
>>>evaluation   4     500
>>>evaluation   5     900
>>>evaluation  11     100
>>>evaluation  12       0
>>>evaluation  13     100
>>>evaluation  14     100
>>>evaluation  15     100
>>>evaluation  16     100
>>>evaluation  17     120
>>>evaluation  21      10
>>>evaluation  22       3
>>>evaluation  23      16
>>>evaluation  24       1
>>>evaluation  25      50
>>>evaluation  26       8
>>>evaluation  27      13
>>>evaluation  28     525
>>>evaluation  29     150
>>>evaluation  30      50
>>>evaluation  31     100
>>>evaluation  32 10 10 10 10 10 10 10 10 9 22 -2 -11 21 50 15 10 13 20 16 17 26 21
>>>25 16 7 4 3 11 12 16 10 6 -1 1 7 11 7 13 8 0 -1 -1 4 9 4 6 5 0 -3 -5 -5 -7 -9 4
>>>4 -2 -3 -3 -3 -3 -3 -3 -3 -3
>>>evaluation  33 0 2 2 2 23 43 212 212
>>>evaluation  34 0 0 0 0 0 0 7 7
>>>evaluation  35 0 1 2 12 23 53 117 117
>>>evaluation  36 0 6 6 10 13 22 43 43
>>>evaluation  37 0 5 15 36 49 55 63 63 63
>>>evaluation  38 0 9 28 44 68 89 89 89 89
>>>evaluation  39 0 3 9 9 9 9 9
>>>evaluation  40 0 0 6 24 40 64 80 94 96
>>>evaluation  41 0 0 0 0 12 60 100 0
>>>evaluation  42 45 41 20 19 18 13 13 9 9 9 9 9 6 6 6 6 5 4 3 3 3 3 3 3 3 3 3 3 3
>>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 0
>>>evaluation  43 72 36 36 36 34 31 30 30 27 27 24 24 21 21 18 18 15 15 12 12 10 10
>>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>>evaluation  51 6 6 4 3 1 0 0 0
>>>evaluation  52 -32 -16 -8 -8 -8 -8 -16 -32 -24 -24 4 4 4 4 -24 -24 -8 2 6 6 6 6
>>>2 -8 -8 2 6 6 6 6 2 -8 -8 2 4 4 4 4 2 -8 -8 2 2 2 2 2 2 -8 -8 -8 0 0 0 0 -8 -8
>>>-8 -8 -8 -8 -8 -8 -8 -8
>>>evaluation  53 0 0 0 0 0 0 0 0 0 0 6 12 12 6 0 0 0 0 12 18 18 12 0 0 0 0 12 15
>>>15 12 0 0 0 0 0 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>>evaluation  61      10
>>>evaluation  62       2
>>>evaluation  63      12
>>>evaluation  64       4
>>>evaluation  65     175
>>>evaluation  66 -12 -18 6 19 18 22 11 -2 -13
>>>evaluation  67 7 7 6 3 1 0 0 0
>>>evaluation  68 -16 -16 0 0 0 0 -16 -16 -16 2 4 4 4 4 2 -16 -4 2 6 6 6 6 2 -4 -4
>>>2 6 6 6 6 2 -4 -4 2 4 4 4 4 2 -4 -4 2 2 2 2 2 2 -4 -16 2 0 0 0 0 2 -16 -16 -16
>>>-8 -8 -8 -8 -16 -16
>>>evaluation  71      16
>>>evaluation  72       3
>>>evaluation  73       4
>>>evaluation  74       9
>>>evaluation  75      17
>>>evaluation  76      10
>>>evaluation  77      20
>>>evaluation  78      23
>>>evaluation  79      10
>>>evaluation  80 9 9 6 4 2 0 0 0
>>>evaluation  81 6 6 2 1 1 0 0 0
>>>evaluation  82 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0
>>>0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0
>>>evaluation  91      22
>>>evaluation  92       6
>>>evaluation  93      50
>>>evaluation  94       6
>>>evaluation  95       8
>>>evaluation  96 10 10 10 4 0 0 0 0
>>>evaluation  97 9 9 5 0 0 0 0 0
>>>evaluation  98 0 0 0 0 0 0 0 0 -15 0 4 5 5 4 0 -15 0 2 4 10 10 4 2 0 0 2 10 12
>>>12 10 2 0 -10 2 10 12 12 10 2 -10 -10 -10 4 10 10 4 -10 -10 -10 2 8 8 8 8 2 -10
>>>-10 -8 0 0 0 0 -8 -10
>>>evaluation 101      15
>>>evaluation 102      96
>>>evaluation 103     600
>>>evaluation 104       7
>>>evaluation 105     -86
>>>evaluation 106       3
>>>evaluation 107 0 0 0 0 0 0 0 0 1 1 1 2 2 4 4 4 4 7 11 13 13 18 20 20 27 27 30 37
>>>37 37 37 37 37 43 43 43 47 47 47 47 53 54 57 59 61 61 61 61 61 61 64 65 65 65 66
>>>68 69 72 72 75 80 80 87 87 97 98 99 108 154 165 184 217 245 249 286 397 426 553
>>>689 799 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922
>>>922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922
>>>922 922 922 922 922 922 922 922 922 922
>>>evaluation 108 8 8 8 8 8 8 8 8 7 7 7 7 7 7 7 7 6 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5 4
>>>4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 1 1 2 3 3 2 1 1 1 0 2 3 3 2 0 1
>>>evaluation 109 -50 -30 -30 -30 -30 -30 -30 -50 -30 -30 0 0 0 0 -30 -30 -30 -10
>>>30 40 40 30 -10 -30 -30 -10 30 40 40 30 -10 -30 -30 -10 20 30 30 20 -10 -30 -30
>>>-10 0 10 10 0 -10 -30 -30 -20 -10 0 0 -10 -20 -30 -50 -40 -30 -20 -20 -30 -40
>>>-50
>>>evaluation 110 -90 -70 -50 -30 -30 -30 -30 -30 -70 -50 -30 -30 0 0 0 -10 -70 -50
>>>-30 -10 30 30 30 0 -70 -50 -30 -10 30 30 30 0 -70 -50 -30 -10 30 30 30 0 -70 -50
>>>-30 -10 20 20 20 0 -70 -50 -30 -20 -10 -10 -10 -10 -90 -70 -50 -40 -30 -20 -20
>>>-30
>>>evaluation 111 -30 -30 -30 -30 -30 -50 -70 -90 -10 0 0 0 -20 -30 -50 -70 0 30 30
>>>30 -10 -30 -50 -70 0 30 30 30 -10 -30 -50 -70 0 30 30 30 -10 -30 -50 -70 0 20 20
>>>20 -10 -30 -50 -70 -10 -10 -10 -10 -20 -30 -50 -70 -30 -20 -20 -30 -40 -50 -70
>>>-90
>>>evaluation 112 0 3 15 16
>>>evaluation 113 0 12 13 16
>>>evaluation 114 0 4 13 20 21 25 31 35 35 39 40 40 40 48 52 56 56 56 56 56 56 56
>>>56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56
>>>56 56 56 56 56 56 56 56 56 56 56 56 56 56 56
>>>evaluation 115 16 16 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25
>>>26 26 27 27 28 28 29 29 30 30 31 31 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32
>>>32 32 32 32 32 32 32 32 32 32 32 32 32 32 32
>>>evaluation 121      10
>>>evaluation 122       7
>>>evaluation 123      12
>>>evaluation 124      20
>>>evaluation 125       2
>>>evaluation 126       3
>>>evaluation 127      20
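
For readers unfamiliar with the annealing rule quoted above, here is a minimal sketch of the usual Metropolis acceptance test for a score that is being maximized: a better candidate is always accepted, and a worse one is accepted with probability exp((V1 - V0)/T). This is not Anthony's tuner. The geometric cooling schedule, the one-parameter-at-a-time step, the toy score function, and all names are assumptions made for illustration.

```python
import math
import random

def accept(v0, v1, temperature, rng=random):
    """Metropolis test: v0 is the current score, v1 the candidate score."""
    if v1 >= v0:
        return True
    # Worse candidate: accept with probability exp((v1 - v0) / T).
    return rng.random() < math.exp((v1 - v0) / temperature)

def anneal(score, params, steps=2000, t_start=10.0, t_end=0.1, seed=0):
    """Maximize score(params) over integer parameters (toy sketch)."""
    rng = random.Random(seed)
    current = list(params)
    current_score = score(current)
    best, best_score = list(current), current_score
    for i in range(steps):
        # Geometric cooling from t_start down to t_end (an assumption).
        t = t_start * (t_end / t_start) ** (i / max(1, steps - 1))
        candidate = list(current)
        j = rng.randrange(len(candidate))
        candidate[j] += rng.choice((-1, 1))   # perturb one parameter
        cand_score = score(candidate)
        if accept(current_score, cand_score, t, rng):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score

if __name__ == "__main__":
    target = [3, -2, 5]                       # hypothetical optimum
    toy = lambda p: -sum((a - b) ** 2 for a, b in zip(p, target))
    print(anneal(toy, [0, 0, 0]))
```

In a real tuner the score would come from something expensive and noisy, such as match results, which is exactly where the question of how many games stand behind each V value comes in.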



Last modified: Thu, 15 Apr 21 08:11:13 -0700
