Author: Peter Fendrich
Date: 12:09:16 09/28/04
On September 28, 2004 at 13:42:33, Stuart Cracraft wrote:
>On September 28, 2004 at 09:06:21, Peter Fendrich wrote:
>
>>On September 28, 2004 at 08:22:20, Stuart Cracraft wrote:
>>
>>>On September 28, 2004 at 05:17:26, Peter Fendrich wrote:
>>>
>>>>On September 27, 2004 at 23:45:54, Stuart Cracraft wrote:
>>>>
>>>>>I experimented with reordering the moves at the root ply at iterative
>>>>>depth iply > 1 (where 1 is the first iteration), sorting by the results
>>>>>of iteration iply-1: the total node count of quiescence and main search
>>>>>for each root move, defined as the number of entries into each of those
>>>>>subroutines.
>>>>>
>>>>>For the first iteration I didn't sort the root moves by quiescence
>>>>>scores but by my normal scheme (I tried quiescence and it was worse).
>>>>>I felt this gave the method above a fairer chance.
>>>>>
>>>>>I sorted the root moves for iply > 1 in the following seven ways, one
>>>>>per part of the experiment (a sketch follows the list):
>>>>>
>>>>> sort by normal method (history heuristic, mvv/lva, see, etc.)
>>>>> sort exactly by subtree node count, nothing else
>>>>> sort by subtree node count added to normal score (hh, mvv/lva, see, etc.)
>>>>> same as previous but node count x 10 before addition
>>>>> same as previous but node count x 100 before addition
>>>>> same as previous but node count x 1000 before addition
>>>>> same as previous but node count x 10000 before addition
>>>>>
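[Editor's note: a minimal sketch of the pure node-count variant (the second
in the list), in C. The names and structure are my own illustration, not
Stuart's code; I'm assuming a per-root-move node counter is kept during each
iteration.]

    #include <stdlib.h>

    #define MAX_ROOT_MOVES 256

    typedef struct {
        int move;            /* encoded root move                        */
        long score;          /* normal ordering score (hh, mvv/lva, see) */
        unsigned long nodes; /* subtree nodes searched in last iteration */
    } RootMove;

    /* Descending subtree node count: bigger subtrees are searched first. */
    static int by_nodes_desc(const void *a, const void *b)
    {
        const RootMove *x = a, *y = b;
        if (x->nodes != y->nodes)
            return (y->nodes > x->nodes) ? 1 : -1;
        return 0;
    }

    /* Called before each iteration with iply > 1. */
    void sort_root_moves(RootMove *rm, int n)
    {
        qsort(rm, n, sizeof rm[0], by_nodes_desc);
    }

The blended variants would instead sort on score + k * nodes, with
k in {1, 10, 100, 1000, 10000}.
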
>>>>>The results, measured by the number solved on Win-at-Chess (WAC),
>>>>>varied from 250 for the first method in the list to 234 for the last.
>>>>>Most bunched up between 244 and 247; the first scored 250, my current
>>>>>best on WAC with everything hand-tuned.
>>>>>
>>>>>I'm convinced that this style of root sorting is slightly worse for
>>>>>my short searches than what I am using: a combination of the history
>>>>>heuristic, see(), and centrality with various bonuses, about half a
>>>>>page of code sprinkled about.
>>>>>
>>>>>The advantage of sorting the root moves by subtree node count is its
>>>>>simplicity. It eliminates about half a page of code and introduces
>>>>>about a quarter page, for only slightly worse results (within 1-2% of
>>>>>my current result), so that is good.
>>>>>
>>>>>Still, I think I'll leave it #ifdef'ed out for now and use it as a
>>>>>baseline that can only be improved upon by hand-tuning my current
>>>>>methods and others yet to be discovered.
>>>>>
>>>>>Stuart
>>>>
>>>>I've noticed that you often refer to WAC and also do very quick searches.
>>>>If you get 247 in one test and 250 in another, that doesn't mean a thing
>>>>unless you examine the positions that changed. Very often you will find
>>>>that the difference is due to random coincidences.
>>>>I'm sure you could get such differences just by making some innocent
>>>>change somewhere in the code...
>>>>There will always be some part of your move ordering (in the tree) that
>>>>is random, and the same goes for which positions and moves happen to
>>>>stay in the hash table and the killer and history lists.
>>>>/Peter
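[Editor's note: to put a rough number on this point, here is a
back-of-the-envelope check under the simplifying assumption that WAC's ~300
positions behave like independent trials. This is my illustration, not
anything from the thread.]

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double n = 300.0;              /* WAC positions       */
        double p = 250.0 / 300.0;      /* observed solve rate */
        double sigma = sqrt(n * p * (1.0 - p));
        printf("sigma = %.1f positions\n", sigma);  /* about 6.5 */
        return 0;
    }

So 247 vs. 250 is well inside one standard deviation. The positions are not
truly independent, but the order of magnitude stands.
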
>>>
>>>A variance of 1% is a non-issue, although a question mark. A
>>>variance of 2% is probably an issue.
>>>
>>>My point was that with hand-tuning I achieved X and with
>>>root move ordering by node count I got .99X or .98X.
>>>
>>>I don't think I've ever varied 2% based on the vicissitudes
>>>of the machine or phase of the moon.
>>>
>>>Note: I don't use random numbers in my program except to set up the
>>>Zobrist hash tables, and there I always generate the exact same
>>>sequence of hash codes, so my program is deterministic from run to
>>>run. That is the only place "random" is used.
>>>
>>>In any given run I can be pretty sure that my variance is 1% max.
>>>
>>>Stuart
>>
>>I think that you missed my point.
>>Your program is deterministic from run to run; that's fine.
>>What I'm talking about is what happens between changes to the program.
>>I have done this several times and really studied why a test solved, let's
>>say, 2 more positions. Just pick the 2 solved (or not solved) positions and
>>take a look and you will see what I mean. It's easier if you can print out
>>the tree. Often there are more or less random reasons when time is very
>>short. For instance, let's say that somewhere in the tree you could have
>>both Qf6 and Qh6 as killer moves, and you would get whichever was stored
>>first. Qh6 will eventually lead to a mate; Qf6 will not. This is not known
>>at the time you save the killer. Now, if you happened to have Qh6 as a
>>killer you would find the mate a little quicker, not because your program
>>knew that Qh6 resulted in a mate but because of some random condition. If
>>you make a change that suddenly puts Qf6 first instead, the program takes a
>>little longer, and maybe too long for your time settings. It may get a
>>worse result for the "wrong" reason.
>>This is the random variation that I meant.
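[Editor's note: to make the order dependence concrete, here is a sketch of a
common two-slot killer table. This is generic, hypothetical code, not Terra's
or any engine's actual implementation.]

    #define MAX_PLY 64

    static int killer[MAX_PLY][2];

    /* Called on a beta cutoff by a quiet move. */
    void store_killer(int ply, int move)
    {
        if (killer[ply][0] != move) {
            killer[ply][1] = killer[ply][0]; /* demote the older killer   */
            killer[ply][0] = move;           /* newest cutoff tried first */
        }
    }

If Qh6 happens to cause its cutoff last, it sits in slot 0 and is tried
first; an unrelated change that makes Qf6 the more recent cutoff reverses
the try order.
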
>>
>>That's one reason why I never rely on a test set only.
>>
>>/Peter
>
>Can you fully describe your entire test methodology?
I have one methodology that I would like to use and one that I actually use,
because I'm too impatient. Nothing new, I think; no shortcuts.
I have selected a couple of engines that I play a gauntlet against.
They are a little better than, a little worse than, and equal to Terra in
strength. I don't download upgrades of these engines, because I want to keep
them stable in order to track progress. It is about 300 games every time.
After a year or so I select a new batch of engines.
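[Editor's note: for scale, with about 300 games the score still carries
noticeable statistical noise. A rough conversion from match score to an Elo
estimate, using the standard logistic model; this is my illustration with
made-up numbers, not Peter's tooling.]

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double wins = 140, draws = 60, losses = 100;
        double n = wins + draws + losses;        /* ~300 games         */
        double s = (wins + 0.5 * draws) / n;     /* score fraction     */
        double elo = -400.0 * log10(1.0 / s - 1.0);
        double se  = sqrt(s * (1.0 - s) / n);    /* ~0.03 score sigma  */
        printf("score %.3f -> %+.0f Elo (sigma %.3f)\n", s, elo, se);
        return 0;
    }

A one-sigma error of about 3% of the score corresponds to roughly 20 Elo at
these levels, which is why a single gauntlet only shows fairly large changes
reliably.
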
There are a few considerations:
- Turn off learning (if learning isn't what I want to test)
- Turn off books, or use very small, wide books (if books aren't the issue).
  In Arena I can select random book moves for both parties. Lately I've
  tried running the Nunn positions instead.
- Keep EGTBs on (if endgame logic isn't the issue)
Before these matches I run some test set or play some fast games, but only in
order to catch plain bugs quickly.
If I'm specifically developing endgame logic I run endgame test sets; that is
in fact about to happen soon.
If I were specifically focusing on tactics I would, in addition to games, run
tactical test sets like WAC and others. I've almost never done that, however.
...and so on...
Even if I wanted to develop a bullet king and only run short time-control
tests, I wouldn't rely on WAC.
You can compare different engines' results on WAC and see that the
correlation with their relative strength isn't especially high.
/Peter