Computer Chess Club Archives


Subject: Re: Junior's long lines: more data about this....

Author: Don Dailey

Date: 22:45:12 12/31/97


>>Here is how I do SELF testing and some of my thoughts on it.
>>
>>I have 200 very shallow openings and they go to exactly 5 moves for
>>each player or 10 ply.  Larry Kaufman picked them so that they are
>>all relatively normal, span a lot of theory but do not get you too
>>deeply into the game (so we test opening heuristics too.)
>>
>>200 openings gives us a total of 400 games, so we try to test in
>>batches of 400 games.
>
>How long does a 400-game match take to get accurate results?

I often start with very fast games so I can get lots of results very
quickly.   Often a new algorithm will get badly beaten and it will
make no real sense to continue.  But when everything seems ok and not
too one sided I migrate to longer and longer games.
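
The batch structure quoted above (200 openings, each played once with either program as White, for 400 games) could be laid out roughly as follows. This is only a sketch: the `build_batch` helper, the engine labels "old" and "new", and the placeholder opening names are all hypothetical, not taken from the actual test setup.

```python
def build_batch(openings, engine_a="old", engine_b="new"):
    """Return two games per opening, one with each engine as White.

    The engine labels are placeholders (an assumption for illustration).
    """
    games = []
    for opening in openings:
        # Same opening, colors swapped, so neither side gets a color advantage.
        games.append({"opening": opening, "white": engine_a, "black": engine_b})
        games.append({"opening": opening, "white": engine_b, "black": engine_a})
    return games


# Placeholder names standing in for the 200 hand-picked 10-ply openings.
openings = [f"opening_{i:03d}" for i in range(200)]
batch = build_batch(openings)
print(len(batch))  # 400
```

Playing each opening from both sides means an opening that favors one color does not bias the match result.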

Here is an interesting topic we should discuss:  How meaningful is
it to test at much faster time controls than actual tournament time
controls?  Here is my sense of this subject:

At one time I believed it to be very important to do lots of testing
at tournament time controls, after all, that is what you are trying
to optimize.   But my opinion on this now is that it is (mostly) a
waste of time.   On virtually every test I ever do, I get the same
results on average.   I will test levels that vary from 1/2 sec or
less per move up to 2 or 3 minutes per move and the better
algorithm tends to win at ALL levels with about an equal ratio.  The
only trend I have ever noticed is that really tactical algorithms
like wild move extensions will tend to do much better on really low
levels.  But after 3 or 4 ply it does not seem to make a bit of
difference.  I have yet to find an algorithm that scores better at
fast time and worse at long (or vice versa.)
Even with the wild tests it was more like 55% at 2 and 3 ply, 51%
at everything higher!

Now there are a few that SEEM to have this behaviour until I check
it out more thoroughly and get bigger sample sizes and test at even
higher levels.  It always turns out to be statistical noise I saw.
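
To put a rough number on that noise, here is a minimal sketch of the margin of error on a match score. It assumes independent games and a 95% interval (z = 1.96), and it uses the worst-case per-game variance score * (1 - score), which draws can only reduce. The `score_margin` name is made up for this example.

```python
import math


def score_margin(n_games, score=0.5, z=1.96):
    """Approximate 95% margin of error on a match score fraction.

    Assumes independent games. score * (1 - score) is an upper bound on
    the per-game variance, so draws make the true margin somewhat smaller.
    """
    standard_error = math.sqrt(score * (1.0 - score) / n_games)
    return z * standard_error


print(round(score_margin(400), 3))  # 0.049
```

Under these assumptions a 400-game batch is uncertain by nearly 5 percentage points either way, so a 51% result is well inside the noise and even 55% is only borderline.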

So my current belief is that you need to search deep enough to get
beyond about 3 ply and after this it will not matter much at all.
Even if there is a very slight effect it's cancelled out by the
small sample sizes you are limited to with long games.  If I had
to test at tournament time controls it would take 2 months to get
in 300 or 400 games.  This is ridiculous and I don't consider it
enough games anyway.  I feel like it's a fortunate accident that
this kind of testing may not be all that necessary.
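
The sample-size problem can be made concrete with a small sketch that estimates how many games it takes before a given edge over 50% stands clear of the noise. It assumes independent games, a 95% interval (z = 1.96), and the worst-case variance of 0.25 per game; `games_needed` is a hypothetical helper, not part of any testing tool.

```python
import math


def games_needed(edge, z=1.96):
    """Games required for the 95% margin of error to shrink to `edge`.

    `edge` is the score advantage over 50% (0.05 means a 55% score).
    Uses the worst-case standard deviation of 0.5 per game, so this is
    a conservative estimate when there are many draws.
    """
    return math.ceil((z * 0.5 / edge) ** 2)


for edge in (0.05, 0.01):
    print(f"{0.5 + edge:.0%} score: about {games_needed(edge)} games")
```

On these assumptions a 55% result needs a few hundred games, while a 51% result needs on the order of 9,600, which is far beyond what long time controls allow.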

Occasionally I "bite the bullet" and go for some longer testing
but it has yet to show me something I couldn't see with the short
testing.   Larry Kaufman once believed (I assume he still does)
that game in 10 or 15 was a very good compromise because he felt
it simulated tournament time controls well: you get most of the
depth (about 2 ply less) with a much reduced investment of time.
Even if there is an odd/even effect it should be the same at
about that time control.  He used to test machines using that
time control.


-- Don



Last modified: Thu, 15 Apr 21 08:11:13 -0700
