Computer Chess Club Archives


Subject: Re: Evaluation Autotuning

Author: Anthony Cozzie

Date: 14:18:57 06/28/04


On June 28, 2004 at 16:42:33, Vincent Diepeveen wrote:

>On June 28, 2004 at 08:54:00, Anthony Cozzie wrote:
>
>Anthony, an important question.
>
>My question is on which grounds you decide to pick a parameter P out of the
>total number of tunable parameters in crafty's evaluation to tune.

I pick a random one.

>By the way, how can you draw conclusions based upon 10 games?
>
>Of course, if you make some random change that is horrible, it will be found
>out pretty soon, like a doubled pawn getting a 5.0 pawn bonus instead of a 0.1
>pawn penalty. Ten games will perhaps detect such a case.
>
>But if you pick 0.05 or 0.5 for a doubled pawn, a 10-game test won't show with
>any real statistical confidence which one is better.

The hope is that I can get several volunteers. I'd like 200-300 games for some
real statistical significance, but as I have said, I don't have Windows or any
commercial chess engines, so I am hoping to get some help.
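
To put rough numbers on the 10-games-versus-hundreds question, here is a minimal
sketch of how the uncertainty in a match score shrinks with the number of games.
The win and draw rates (35% and 30%) are assumptions picked purely for
illustration, as is the usual logistic Elo conversion; the function names are
hypothetical and not part of crafty or of any test harness.

```python
import math

def score_standard_error(n_games, win=0.35, draw=0.30):
    """Standard error of the mean score per game (win=1, draw=0.5, loss=0).

    The win/draw rates are assumed values for two roughly equal engines.
    """
    mean = win + 0.5 * draw
    variance = win * 1.0 + draw * 0.25 - mean ** 2
    return math.sqrt(variance / n_games)

def elo_from_score(score):
    """Elo difference implied by an expected score, using the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(n_games, z=1.96):
    """Approximate 95% Elo margin around an even (50%) result."""
    upper = min(0.5 + z * score_standard_error(n_games), 0.999)  # keep the log finite
    return elo_from_score(upper)

if __name__ == "__main__":
    for n in (10, 100, 300):
        print(f"{n:4d} games: +/- {elo_margin(n):.0f} Elo (approx.)")
```

Under these assumptions the margin falls with the square root of the number of
games, so going from 10 to 300 games narrows it by a factor of about 5.5.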

anthony

>>Most of you probably know by now that my "secret project" with Bob has been
>>evaluation autotuning.  I feel that with the improvements in parallel search and
>>CPU speed, the evaluation is beginning to become more important than the
>>search in top programs  (e.g. Shredder).  As the number of terms increases, the
>>number of relations increases quadratically and the problem becomes harder.  I
>>don't think that autotuning will ever be as good as hand tuning, but it is a
>>good way to get a first guess for the value of a parameter, as well as seeing
>>whether a new parameter is worth anything.  This weekend, aside from watching
>>massive amounts of "Hajime no Ippo", I have finished a run with the latest
>>tweaks to the algorithm.
>>
>>The basic idea (which actually dates to my senior year in college, when I took
>>CAD tools, 18760) is to use simulated annealing. Simulated annealing makes
>>random changes and accepts them probabilistically:
>>
>>(V1 > V0) : accept change
>>(V1 < V0) : accept if rand() < exp((V1 - V0)/T)
>>
>>Where T is the temperature, which gradually gets smaller during the anneal.
>>This has the effect of ignoring small changes initially, and concentrating on
>>the big picture. This is the best link I could find in a few minutes of
>>Google searching; I will explain more if people are still unclear:
>>
>>http://members.aol.com/btluke/simann1.htm
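
For readers who want to see the acceptance rule in motion, here is a minimal
sketch of a simulated annealing loop. It uses the usual Metropolis form, in
which a worse candidate is accepted with probability exp((V1 - V0)/T). It is not
the tuner described in this post: the toy objective, the geometric cooling
schedule, the step size, and every name in it are assumptions made for
illustration. In real tuning, evaluate() would be a (noisy) match result rather
than a formula.

```python
import math
import random

def anneal(params, evaluate, t_start=50.0, t_end=0.01, cooling=0.99,
           step=5, rng=None):
    """Maximize evaluate(params) by simulated annealing (illustrative only)."""
    rng = rng or random.Random()
    current = list(params)
    v0 = evaluate(current)
    best, best_v = list(current), v0
    t = t_start
    while t > t_end:
        # Pick one parameter at random and nudge it by a small integer amount.
        i = rng.randrange(len(current))
        candidate = list(current)
        candidate[i] += rng.randint(-step, step)
        v1 = evaluate(candidate)
        # Better changes are always taken; worse ones with prob. exp((v1 - v0)/t).
        if v1 > v0 or rng.random() < math.exp((v1 - v0) / t):
            current, v0 = candidate, v1
            if v0 > best_v:
                best, best_v = list(current), v0
        t *= cooling  # assumed geometric cooling schedule
    return best, best_v

# Toy objective: closeness to a hidden target vector (a stand-in for playing strength).
TARGET = [100, 300, 300, 500, 900]

def toy_evaluate(p):
    return -sum((a - b) ** 2 for a, b in zip(p, TARGET))

if __name__ == "__main__":
    start = [90, 310, 290, 520, 880]
    best, value = anneal(start, toy_evaluate, rng=random.Random(0))
    print(best, value)
```

At high T the exponential is close to 1, so most changes are accepted; as T
shrinks, only improvements and very small regressions get through.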
>>
>>I am looking for some testers to run games with the latest crafty against
>>various engines. Basically, run N games against challenger X with the standard
>>settings, and then N games with the new settings.  I am only really interested
>>in longer time controls: 20 min+ on an Athlon 2.0G or so (70 min on a P-650, etc.),
>>and at least 10 games.  If this is ever published, I will include everyone who
>>runs some test games in the acknowledgements section.
>>
>>anthony
>>
>>The crafty evaluation script:
>>
>>evaluation   1     100
>>evaluation   2     300
>>evaluation   3     300
>>evaluation   4     500
>>evaluation   5     900
>>evaluation  11     100
>>evaluation  12       0
>>evaluation  13     100
>>evaluation  14     100
>>evaluation  15     100
>>evaluation  16     100
>>evaluation  17     120
>>evaluation  21      10
>>evaluation  22       3
>>evaluation  23      16
>>evaluation  24       1
>>evaluation  25      50
>>evaluation  26       8
>>evaluation  27      13
>>evaluation  28     525
>>evaluation  29     150
>>evaluation  30      50
>>evaluation  31     100
>>evaluation  32 10 10 10 10 10 10 10 10 9 22 -2 -11 21 50 15 10 13 20 16 17 26 21
>>25 16 7 4 3 11 12 16 10 6 -1 1 7 11 7 13 8 0 -1 -1 4 9 4 6 5 0 -3 -5 -5 -7 -9 4
>>4 -2 -3 -3 -3 -3 -3 -3 -3 -3
>>evaluation  33 0 2 2 2 23 43 212 212
>>evaluation  34 0 0 0 0 0 0 7 7
>>evaluation  35 0 1 2 12 23 53 117 117
>>evaluation  36 0 6 6 10 13 22 43 43
>>evaluation  37 0 5 15 36 49 55 63 63 63
>>evaluation  38 0 9 28 44 68 89 89 89 89
>>evaluation  39 0 3 9 9 9 9 9
>>evaluation  40 0 0 6 24 40 64 80 94 96
>>evaluation  41 0 0 0 0 12 60 100 0
>>evaluation  42 45 41 20 19 18 13 13 9 9 9 9 9 6 6 6 6 5 4 3 3 3 3 3 3 3 3 3 3 3
>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>>3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 0
>>evaluation  43 72 36 36 36 34 31 30 30 27 27 24 24 21 21 18 18 15 15 12 12 10 10
>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
>>evaluation  51 6 6 4 3 1 0 0 0
>>evaluation  52 -32 -16 -8 -8 -8 -8 -16 -32 -24 -24 4 4 4 4 -24 -24 -8 2 6 6 6 6
>>2 -8 -8 2 6 6 6 6 2 -8 -8 2 4 4 4 4 2 -8 -8 2 2 2 2 2 2 -8 -8 -8 0 0 0 0 -8 -8
>>-8 -8 -8 -8 -8 -8 -8 -8
>>evaluation  53 0 0 0 0 0 0 0 0 0 0 6 12 12 6 0 0 0 0 12 18 18 12 0 0 0 0 12 15
>>15 12 0 0 0 0 0 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>evaluation  61      10
>>evaluation  62       2
>>evaluation  63      12
>>evaluation  64       4
>>evaluation  65     175
>>evaluation  66 -12 -18 6 19 18 22 11 -2 -13
>>evaluation  67 7 7 6 3 1 0 0 0
>>evaluation  68 -16 -16 0 0 0 0 -16 -16 -16 2 4 4 4 4 2 -16 -4 2 6 6 6 6 2 -4 -4
>>2 6 6 6 6 2 -4 -4 2 4 4 4 4 2 -4 -4 2 2 2 2 2 2 -4 -16 2 0 0 0 0 2 -16 -16 -16
>>-8 -8 -8 -8 -16 -16
>>evaluation  71      16
>>evaluation  72       3
>>evaluation  73       4
>>evaluation  74       9
>>evaluation  75      17
>>evaluation  76      10
>>evaluation  77      20
>>evaluation  78      23
>>evaluation  79      10
>>evaluation  80 9 9 6 4 2 0 0 0
>>evaluation  81 6 6 2 1 1 0 0 0
>>evaluation  82 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0
>>0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0 0 0 4 6 6 4 0 0
>>evaluation  91      22
>>evaluation  92       6
>>evaluation  93      50
>>evaluation  94       6
>>evaluation  95       8
>>evaluation  96 10 10 10 4 0 0 0 0
>>evaluation  97 9 9 5 0 0 0 0 0
>>evaluation  98 0 0 0 0 0 0 0 0 -15 0 4 5 5 4 0 -15 0 2 4 10 10 4 2 0 0 2 10 12
>>12 10 2 0 -10 2 10 12 12 10 2 -10 -10 -10 4 10 10 4 -10 -10 -10 2 8 8 8 8 2 -10
>>-10 -8 0 0 0 0 -8 -10
>>evaluation 101      15
>>evaluation 102      96
>>evaluation 103     600
>>evaluation 104       7
>>evaluation 105     -86
>>evaluation 106       3
>>evaluation 107 0 0 0 0 0 0 0 0 1 1 1 2 2 4 4 4 4 7 11 13 13 18 20 20 27 27 30 37
>>37 37 37 37 37 43 43 43 47 47 47 47 53 54 57 59 61 61 61 61 61 61 64 65 65 65 66
>>68 69 72 72 75 80 80 87 87 97 98 99 108 154 165 184 217 245 249 286 397 426 553
>>689 799 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922
>>922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922 922
>>922 922 922 922 922 922 922 922 922 922
>>evaluation 108 8 8 8 8 8 8 8 8 7 7 7 7 7 7 7 7 6 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5 4
>>4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 1 1 2 3 3 2 1 1 1 0 2 3 3 2 0 1
>>evaluation 109 -50 -30 -30 -30 -30 -30 -30 -50 -30 -30 0 0 0 0 -30 -30 -30 -10
>>30 40 40 30 -10 -30 -30 -10 30 40 40 30 -10 -30 -30 -10 20 30 30 20 -10 -30 -30
>>-10 0 10 10 0 -10 -30 -30 -20 -10 0 0 -10 -20 -30 -50 -40 -30 -20 -20 -30 -40
>>-50
>>evaluation 110 -90 -70 -50 -30 -30 -30 -30 -30 -70 -50 -30 -30 0 0 0 -10 -70 -50
>>-30 -10 30 30 30 0 -70 -50 -30 -10 30 30 30 0 -70 -50 -30 -10 30 30 30 0 -70 -50
>>-30 -10 20 20 20 0 -70 -50 -30 -20 -10 -10 -10 -10 -90 -70 -50 -40 -30 -20 -20
>>-30
>>evaluation 111 -30 -30 -30 -30 -30 -50 -70 -90 -10 0 0 0 -20 -30 -50 -70 0 30 30
>>30 -10 -30 -50 -70 0 30 30 30 -10 -30 -50 -70 0 30 30 30 -10 -30 -50 -70 0 20 20
>>20 -10 -30 -50 -70 -10 -10 -10 -10 -20 -30 -50 -70 -30 -20 -20 -30 -40 -50 -70
>>-90
>>evaluation 112 0 3 15 16
>>evaluation 113 0 12 13 16
>>evaluation 114 0 4 13 20 21 25 31 35 35 39 40 40 40 48 52 56 56 56 56 56 56 56
>>56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56 56
>>56 56 56 56 56 56 56 56 56 56 56 56 56 56 56
>>evaluation 115 16 16 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25
>>26 26 27 27 28 28 29 29 30 30 31 31 32 32 32 32 32 32 32 32 32 32 32 32 32 32 32
>>32 32 32 32 32 32 32 32 32 32 32 32 32 32 32
>>evaluation 121      10
>>evaluation 122       7
>>evaluation 123      12
>>evaluation 124      20
>>evaluation 125       2
>>evaluation 126       3
>>evaluation 127      20



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.