Computer Chess Club Archives



Subject: Re: Extensions?!

Author: Bruce Moreland

Date: 10:00:40 01/13/98




On January 13, 1998 at 09:08:49, Dan Homan wrote:

>I can think of a few reasons that my search would have more nodes, but
>none of them should be a factor of 3.  One thing that did occur to me
>is extensions.  I turned off all extensions and researched this
>position.  In doing so, I found that I searched only about 30% of the
>nodes to reach the same depth as the above example.  Should my
>extensions really be tripling the size of my search tree?

There are some things that are hard to measure and some things that are
easy to measure.  I think that I can suggest how to measure some of the
easy things.

I have a utility that reads my log files, figures out how deep the
program got for each problem, and will compare this depth, and the time
taken to attain it, with a log file produced by another version.

For instance, if version A got 9 plies, and version B got 8 plies, it
will tell me this, and it will also tell me how long each of them took
to get 8 plies, and will produce a ratio expressed as a percentage.

It accumulates some of this information.  In a two-problem suite, if
both A and B got through 8 plies in the first problem, and it took A 12
seconds and B 14 seconds, and both got through ply 12 in the second
problem, A taking 22 seconds and B taking 24 seconds, I'll total this
up, and get 34 seconds for A and 38 seconds for B.  I'll output both of
these numbers and the ratio between them, and conclude that B is 12%
slower than A.
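The accumulation described above can be sketched in a few lines. This is not Bruce's actual tool (its language and log format are unknown); it just reproduces the arithmetic of the two-problem example, assuming the per-problem times have already been pulled out of the logs:

```python
# Sketch of the comparison described above: for each problem, take the
# time each version needed to reach the common depth, total the times,
# and report B's slowdown relative to A as a percentage.

def compare_versions(times_a, times_b):
    """times_a, times_b: lists of seconds, one entry per problem,
    measured at the deepest ply both versions completed."""
    total_a = sum(times_a)
    total_b = sum(times_b)
    ratio = (total_b / total_a - 1.0) * 100.0
    return total_a, total_b, ratio

# The two-problem example from the text: A takes 12s and 22s,
# B takes 14s and 24s.
total_a, total_b, pct = compare_versions([12, 22], [14, 24])
print(f"A: {total_a}s  B: {total_b}s  B is {pct:.0f}% slower")
# A: 34s  B: 38s  B is 12% slower
```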

This is not perfect.  I should probably normalize the numbers somehow so
that problems in which both of the programs finish very close to the
maximum allowable times don't get more weight than those in which
neither of them can quite finish that last ply.  Also, this doesn't take
into account that one of the versions might be getting closer to the
real answer, and therefore is taking more time per ply.  And finally, I
have had a problem with disk caching -- the second run on any given
night usually goes faster than the first one, so when I run these
suites, some of the results are a little bogus.

But even so, I can use this tool.  If I want to examine the effects of
some move ordering change, I can run a large suite for a reasonable
amount of time per problem, and get a number that represents a rough
guess about whether I made things better or worse.

There is another benefit as well.  This program looks at the node counts
taken to complete every ply, and it compares them, and outputs *any*
differences between the two versions in this respect.  So if all I did
was try to make the program go a little faster, without changing any
semantics, I can easily look for node count changes between two
versions, which are almost certainly bugs.
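The node-count check might look something like the sketch below. The keying on (problem, ply) and the dict input are assumptions about how the parsed log data is held; the point is that for a pure speed change, any mismatch at all is suspicious:

```python
# A minimal sketch of the node-count check: if a change was meant to be
# purely a speedup, per-ply node counts should match exactly; any
# difference flags a probable bug.

def diff_node_counts(nodes_a, nodes_b):
    """nodes_a, nodes_b: dicts mapping (problem, ply) -> node count.
    Returns every key where the two versions disagree."""
    diffs = []
    for key in sorted(set(nodes_a) & set(nodes_b)):
        if nodes_a[key] != nodes_b[key]:
            diffs.append((key, nodes_a[key], nodes_b[key]))
    return diffs

# Hypothetical data: the versions agree at ply 1 but diverge at ply 2.
a = {("wac001", 1): 25, ("wac001", 2): 240}
b = {("wac001", 1): 25, ("wac001", 2): 233}
for (prob, ply), na, nb in diff_node_counts(a, b):
    print(f"{prob} ply {ply}: {na} vs {nb}")
```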

So, to summarize this tool, it lets me get a rough idea if I've sped the
program up, and lets me find bugs if I have tried to make a performance
change.

I have another tool that lets me evaluate the results of tactical
suites.  I can "grep" my log files for the string "Success", which will
let me know how many problems I've gotten right with version B, and
compare that number with the number of problems I got right with
version A.
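The grep step is simple enough to sketch directly. The "Success" marker is from the text; the log format around it is an assumption:

```python
# Count solved problems in a log file, the way "grep Success | wc -l"
# would, but in Python so it can feed the other comparison tools.
import tempfile
import os

def count_successes(path):
    """Count lines containing 'Success' in a log file."""
    with open(path) as f:
        return sum(1 for line in f if "Success" in line)

# Demo with a tiny fake log (the line format here is made up).
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
    f.write("problem 1: Success in 2.1s\n"
            "problem 2: time expired\n"
            "problem 3: Success in 0.4s\n")
    log_path = f.name
print(count_successes(log_path))  # 2
os.unlink(log_path)
```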

I'm sure a lot of people do this, but I've taken it a step further.  I
have a tool that will tell me how many were right after 1 second, 2
seconds, etc., out to the duration of the test run.  I can look at this
with my eyeballs or I can chart it with Excel.  The shape of this curve
is very interesting.  Often, two versions will solve the same number,
but if you look at a chart of these intermediate results, you will be
able to choose between them, because one will get a tremendous lead, and
the other will slowly catch up until at the end they are equal.  In this
case, the one that gets answers faster is better.
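The solved-over-time curve can be sketched as below. The input (each problem's time to solve, or nothing if it was never solved) is an assumed intermediate format; charting would happen afterwards, in Excel or anywhere else:

```python
# Sketch of the cumulative solve curve: count how many problems were
# solved within 1 second, 2 seconds, ... out to the run's duration.
# The shape of this curve, not just the final count, is what separates
# two versions that solve the same total.

def solved_curve(solve_times, duration):
    """solve_times: seconds to solve each problem, or None if unsolved.
    Returns a list where entry t-1 is the count solved within t seconds."""
    return [sum(1 for s in solve_times if s is not None and s <= t)
            for t in range(1, duration + 1)]

# Hypothetical data: both versions solve 3 of 4 problems in 5 seconds,
# but A builds its lead early while B only catches up at the end.
a = solved_curve([0.5, 1.2, 3.0, None], 5)  # [1, 2, 3, 3, 3]
b = solved_curve([2.5, 4.0, 4.8, None], 5)  # [0, 0, 1, 2, 3]
print(a, b)
```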

I have another tool which does nothing but output time to solve for each
problem for each version.  I can load this into Excel and chart this,
and notice things like one program solving problem 374 in 1 second while
the other one solves it in 38 seconds.  Sometimes there is an
interesting explanation for this.

So, if you had built these tools for yourself, and were willing to make
a lot of sub-versions that you could test every day (I usually make 3 or
4 versions a day), you could evaluate the effect of these extensions,
together and in isolation.

I get the idea that some people target a specific problem, and if they
can mess with things until they get that problem faster, they call it
good.  I don't do this.  I would much rather miss one problem than solve
everything else 25% slower.  So *every* time I mess with search
extensions or pruning I run one of these suites and try to understand
what the change actually did to overall ability to solve tactical
problems.  If something looks good, I'll run more suites the next day,
just to make sure.

I can't understand how anyone can survive without these tools, actually.

Do you get more correct answers in the same time when you add all of
your extensions in?  Do any of them make your scores increase if you
remove them?  Can you mess with specific extensions to make them go
faster without losing solutions?

bruce


