Computer Chess Club Archives


Subject: Re: listing a few beginner bugs in Omid's 'research'

Author: Vincent Diepeveen

Date: 20:46:06 12/17/02


On December 17, 2002 at 19:54:06, Uri Blass wrote:

>On December 17, 2002 at 19:16:26, Vincent Diepeveen wrote:
>
>>On December 17, 2002 at 18:21:28, Uri Blass wrote:
>>
>>>On December 17, 2002 at 18:11:20, Vincent Diepeveen wrote:
>>>
>>>>On December 17, 2002 at 17:30:36, Bruce Moreland wrote:
>>>>
>>>>If you go back in time a bit you'll see that I had
>>>>major problems with Omid's article and posted them here.
>>>>
>>>>There is more than just the problems you see there.
>>>>
>>>>Also look at his homepage, get the positions he tested,
>>>>and then look at his node counts. For a mate in 2 position
>>>>where I need something like a couple of hundred nodes to get to 10 ply,
>>>>he needs 10 million nodes. Then R=3 reduces that even more.
>>>>
>>>>Also his implementation is buggy of course. It doesn't take into
>>>>account problems with transpositions; a classic beginner's problem.
>>>>
>>>>But most important is that verification search is not something new.
>>>>It is a buggy implementation of something already described years ago,
>>>>with the only 'novelty' being that Omid turns off nullmove *completely*
>>>>after he finds a nullmove failure.
>>>
>>>No, he does not.
>>>There is no point in the tree where he turns off nullmove completely.
>>>
>>>>
>>>>All in all a very sad article. The only good thing about it is
>>>>the quantity of tests done.
>>>>
>>>>The test methods, the implementation and the conclusions are
>>>>at grammar school level.
>>>>
>>>>I do not know who proofread it, but it must have been idiots or people who
>>>>didn't care at all.
>>>>
>>>>Amazingly Bob defended Omid here and said nothing was wrong with
>>>>the article.
>>>
>>>Based on his post, Bob also found that verification search is good for Crafty.
>>>Bob is not the only one who defended Omid.
>>>
>>>I also defend him and you are the only poster who attacks him (even posters who
>>>said that it did not work for them did not say that it is very bad).
>>>Most of what you say is not correct.
>>>
>>>Uri
>>
>>You are dreaming.
>>
>>OK, to list a few bugs in his paper:
>>
>>  a) all his test positions are mates and he doesn't do
>>     checks in qsearch, so R=2 versus R=3 matters a lot because
>>     of the extra ply of main search you miss that would find the
>>     mate for you. So with such a limited qsearch it is logical
>>     that R=2 works better at depth==9 than R=3 at depth==9;
>>     this is *trivial*. It is so trivial that no serious researcher
>>     should compare the same ply depths with each other without
>>     taking time into account.
>
>not all his test positions are mates
>I believe that most of the test positions are not mates.

The vast majority are. I don't know how good your program is at
finding mates, but with DIEP I saw a mate score for each position
within seconds when I tested all the positions.

>The fact that R=2 works at depth=9 better than R=3 at depth=9 is not the point
>of the research.

The whole case for the 'verification' search is based upon
that comparison. The other comparisons are done to talk around it.

>
>>
>>     Because we are going to conclude that minimax search is better
>>     than alfabeta for sure.
>>
>>  b) now already assuming the bug in his comparision is there to
>>     compare depth==9 with depth==9 instead of the factor time,
>>     then the bug is that he is just testing mates so reduction
>>     factor matters. It is trivial to try adaptive nullmove then.
>>
>>  c) there is major bugs in the program Genesis looking at the
>>     branching factor differences between R=1, R=2 and R=3.
>>     I do not know a single serious chess program that has
>>     such a difference.
>
>Genesis is not his serious program but it does not mean that there are bugs in
>it.

The branching factor of Genesis (with nullmove R=2, I mean) is too big
to take it very seriously.

>>
>>  d) Genesis needs way too much nodes to get to a decent plydepth
>>     when compared to even programs doing checks in their qsearch
>>     and extensions in nominal search. For mate in 2 he needs like
>>     10 million nodes to get to depth == 10.
>
>You look at the wrong positions.
>The time that it needs to get to depth 10 in mates in 2 is not interesting and can
>be ignored, but there are enough positions that are not forced mates.

It shows the principle that Genesis needs too many nodes for
such positions. There is no difference in branching factor or number of nodes
needed by Genesis between a very complicated position and a position
with a very simple cutoff idea (a mate in 2, for example), which should only
burn a few nodes.

If the average number of moves and the amount of tactics are big, then in
general I get a much bigger branching factor in DIEP than
when the number of possible moves and tactics is very small.

>>
>>  e) illegal position in his testset.
>
>He replied that these were simply old positions on his site and that he did not use the
>illegal position.

It's encouraging to know how carefully his testset was chosen, then.

>>  f) his algorithm is not new. It is a rewrite of something already
>>     existing, and he rewrote it wrong. He has a bug in his verification
>>     search. You can easily prove it by using transpositions.
>
>I do not know what you talk about.

You know very well what I mean. It has been said many times before.

>>  g) It won't detect a zugzwang for sure, simply because of transposition
>>     bugs. Therefore the only claim can be that it enhances the tactical
>>     abilities of his program.
>
>It can detect zugzwang but only if there is not more than one zugzwang in the
>same line.

If you aren't storing in the hashtable whether a verification has already
been done and taking that into account for transpositions, then it
is *trivial* to prove that you can and will miss zugzwangs with this
algorithm.

Just first get into a subtree where a verification has been done,
then at some position X do a search with nullmove (and no possibility
to verify anymore). Store that in the hashtable.

Somewhere else, before a verification has been done, you get a transposition
to that position X and there the damage is done!
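A minimal sketch of the fix I mean, assuming a simple hashtable entry
(hypothetical code, not from Genesis or DIEP):

    /* Hypothetical transposition table entry; the point is the extra
       'verified' field.  Without it, a score stored while verification was
       switched off can be reused in a subtree that is still supposed to
       verify, and the zugzwang protection silently disappears there.     */
    struct tt_entry {
        unsigned long long key;     /* zobrist key of the position        */
        short depth;                /* draft the score was searched to    */
        short score;
        unsigned char bound;        /* exact / lower / upper              */
        unsigned char verified;     /* 1 if searched with verification on */
    };

    /* Only accept a stored score if it was searched under at least the
       same verification regime as the current node.                      */
    int tt_usable(const struct tt_entry *e, unsigned long long key,
                  int depth, int verify)
    {
        if (e->key != key || e->depth < depth)
            return 0;
        return !verify || e->verified;  /* verified entries are always OK */
    }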

>>     Therefore it is crucial to also test
>>     different forms of adaptive nullmove (with different depths to
>>     go from R=3 to R=2).
>>
>>  h) It is unclear why he concluded that verification search is better using
>>     his own data:
>>       a) more full-width search than verification
>>          clearly finds more positions.
>
>He did not compare verification search against R=3, but only against R=2.
>There is no conclusion whether verified R=3 is better than standard R=3 based on the
>article.
>The games that he played were against R=2 and not against R=3.

That's exactly the weak spot of the implementation. If you implement
something which in practice sees tactically about the same
as adaptive nullmove, then you had better compare against it.

Also a good comparison would be against an adaptive nullmove where you use
R=1 in the last 3 plies,
R=2 in the last 4 plies,
R=3 in the last 5 plies and at bigger remaining depths.

So in principle you are always doing a full search of at least 1 ply to try
to get a fail low.

That misses a lot less in the last few plies before the qsearch, which of
course also covers up possible major bugs in one's qsearch a lot.
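Written out as code, the adaptive scheme above is simply this (my own
sketch with my numbers, nothing from the article):

    /* Sketch of the adaptive reduction described above: pick R from the
       remaining depth so that near the leaves the null-move search stays
       a real (if shallow) search instead of dropping straight into the
       qsearch.                                                           */
    int null_reduction(int depth)       /* depth = remaining plies        */
    {
        if (depth <= 3)
            return 1;                   /* R = 1 in the last 3 plies      */
        if (depth == 4)
            return 2;                   /* R = 2 at 4 plies remaining     */
        return 3;                       /* R = 3 at 5 plies and deeper    */
    }

At depth 3 that still gives a 1-ply null-move search (3 - 1 - 1), which is
exactly the 'full search of 1 ply' I mean.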

>>
>>       b) R=3 uses fewer nodes than his verification search.
>>
>>     It is very unclear how he then concludes that verification search
>>     is better. It doesn't top a single 'this works better' list anywhere.
>
>
>>
>>   i) Even from where I sit and without having Genesis I can already smell
>>      that adaptive nullmove works better than his own verification search.
>>      For his beloved verification search you can easily write down on paper
>>      what it is doing. It simply avoids nullmoving in the last few plies
>>      initially. So there is always a form of adaptive nullmove that is
>>      simply going to outgun it completely.
>
>It does not avoid nullmove and it always uses R=3 after a null move in the search.

I know, but if you write down what happens you will see that it
basically replaces the search by a 1-ply search in the last few plies
before the qsearch.

Adaptive nullmove is closer to that than plain R=3 is. Of course others
who use adaptive nullmove (I'm always using R=3 myself) saw that directly
and posted here that he should compare against adaptive nullmove. I
feel that this is 100% the truth.
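To make it concrete, here is roughly what the scheme in the article boils
down to as far as I can see (a sketch only; the helper routines are
placeholders, not code from Genesis, Crafty or DIEP):

    /* Rough sketch of 'verified' null-move pruning as I read the article.
       R = 3 is used everywhere; 'verify' starts as 1 at the root.         */
    int search(int alpha, int beta, int depth, int verify)
    {
        int value, fail_high = 0;

        if (depth <= 0)
            return qsearch(alpha, beta);

        if (depth > 1 && null_move_allowed()) {
            make_null_move();
            value = -search(-beta, -beta + 1, depth - 3 - 1, verify);
            undo_null_move();
            if (value >= beta) {
                if (!verify)
                    return value;      /* ordinary null-move cutoff        */
                depth--;               /* otherwise: shrink the depth,     */
                verify = 0;            /* switch verification off below,   */
                fail_high = 1;         /* and verify with a real search    */
            }
        }

    research:
        value = search_all_moves(alpha, beta, depth, verify);

        if (fail_high && value < beta) {
            depth++;                   /* verification failed: restore the */
            verify = 1;                /* depth and the flag, search again */
            fail_high = 0;
            goto research;
        }
        return value;
    }

Near the leaves the 'verification' branch gets taken over and over, so
cutoffs that R=2 or R=3 would take directly become small searches instead.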

>If you always use R=3 after a null move then you can expect to use R=3 in most of the
>tree when you search deep enough.

Not really, because you assume the case where you do the verification
near the root. However, search doesn't work like that.

Verification is doing countless 1-ply searches near the leaves where
otherwise the qsearch would handle the cutoff.

Doing a 1-ply search there all the time, which neither R=3 nor R=2
is doing, is very expensive compared to just a qsearch.

Of course a lot of that expensive searching gets pruned away
by transpositions which are caused by bugs, because it doesn't store
whether a cutoff already did a verification or not. So usually the
answer is 'not'. You get a lot of transpositions from parts of the tree
where a verification hasn't been done yet into positions that were
searched without verification.

If you 'bugfix' that, you'll see that the tree gets a lot bigger.

Especially at the search depths which we get at tournament levels.

So not the 9 and 10 ply blitz levels, but the 12-15 ply search depths
which we get at tournament level. There the price of that is extremely
big. But at 9 and 10 ply you can already measure this very clearly.

>>   j) the testset is basically just mating positions in all tests.
>>      That's very dangerous ground for drawing conclusions.
>>
>>Note that at depth == 10 ply with DIEP with R=3 I solve far more
>>positions in way fewer nodes than this guy ever will. This testset is just
>>too simple.
>
>Your comparison is wrong.
>Genesis is not a top program and Omid never claimed that it is a top program.
>If you have a better qsearch and better move ordering and better extensions you
>may solve more positions with fewer nodes.

You get down to a very important point: bugs in the qsearch. I won't claim
that DIEP's qsearch is perfect. In fact it's a huge qsearch that wastes nodes.

But something tells me that Genesis has major bugs in its qsearch.

I remember that very clearly from an experimental program of mine that I
tried to make search very quickly. So I had to make a very primitive qsearch.

I did just a few recaptures. The impact of the quality of that code
was *huge*. Plies of difference!
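For reference, a normal capture-only qsearch looks roughly like this (a
sketch with placeholder board routines, nobody's real code); cutting it
down to a couple of recaptures, or getting the stand-pat logic wrong,
costs plies of tactics:

    /* Minimal capture-only qsearch (sketch; move_t and the board routines
       are placeholders).  No checks in the qsearch, as discussed above.   */
    int qsearch(int alpha, int beta)
    {
        int stand_pat = evaluate();        /* static eval as a lower bound */
        if (stand_pat >= beta)
            return stand_pat;              /* fail high without searching  */
        if (stand_pat > alpha)
            alpha = stand_pat;

        move_t moves[256];
        int n = generate_captures(moves);  /* captures and promotions only */

        for (int i = 0; i < n; i++) {
            make_move(moves[i]);
            int score = -qsearch(-beta, -alpha);
            undo_move(moves[i]);
            if (score >= beta)
                return score;
            if (score > alpha)
                alpha = score;
        }
        return alpha;
    }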

It's here where I bet there are a few major bugs in Genesis. I definitely
want to touch on the subject of the qsearch a lot here, because if you look
at the verification search you'll see that it is basically busy doing 1-ply
searches near the leaves in order to get a cutoff.

>I also remember that Genesis does not do check extensions except in the last ply,
>but that proves nothing.

You mean that in the main search of Genesis there are no check evasion extensions?

That there are no checks in the qsearch is no surprise to me. That's hard to
get to work properly. Bob didn't manage it either, despite trying a few things.

>Some of the programmers of good amateur programs found that the idea is productive for
>them (I do not say that it is productive for everyone, but I do not see the point
>of attacking Genesis when Omid has a better program that is not public).
>
>It is better in the future to also use other programs before publishing, but Bob
>found that it is also good for Crafty in test suites and games.

Just using other programs is not the solution. The solution is
to use a good way of comparing things.

I initially even missed that he didn't compare against adaptive nullmove.

The important thing is that he compares apples with oranges.

Compare search times for me. I will shut up for at least 5 years, as you
see with the research of Hyatt, if you produce something that compares
fixed search times against fixed search times.

Comparing search times is the only reasonable way to measure things.
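The kind of test I mean, in sketch form (a hypothetical harness; the engine
entry points are placeholders): give every version the same fixed time per
position and count solutions, instead of counting nodes at an equal nominal
depth.

    #include <stdio.h>

    /* Hypothetical test harness.  setup_position(), search_for_milliseconds()
       and solution_move() stand in for whatever engine is being measured.  */
    void run_testset(const char **fens, int npos, int millisec)
    {
        int solved = 0;
        for (int i = 0; i < npos; i++) {
            setup_position(fens[i]);
            int best = search_for_milliseconds(millisec);  /* fixed time  */
            if (best == solution_move(i))                  /* not depth   */
                solved++;
        }
        printf("%d / %d solved at %d ms per position\n", solved, npos, millisec);
    }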

You can create wonderful graphs from the time needed to find N solutions
versus the search depths.

You can draw conclusions from these graphs that not everyone will swallow
right away, but at least they won't get mad.

It was really in the eighties that people got away with comparing search depths.

I remember very clearly the Zugzwang speedups which were measured in search depths.

Imagine that I write a special DIEP version which just compares 15-ply
searches with 15-ply searches, without taking time into account.

I am 100% convinced that I would then conclude that my old pre-World
Championship 2002 version works better at 32 processors than my current version.

For your information, to get to that 18 ply in the middlegame with 32 processors,
the pre-World Championship 2002 version needs for sure 30 minutes longer, as it
starts at only a few kilonodes a second and very slowly gets more and more,
because all processors eat bandwidth and there is global
locking.

But if I just look at how many nodes I need at that depth, it will look
like a GREAT speedup.

The new version however might get to that 15 ply within a minute or so,
but for sure it needs more nodes to get there, as to start with I am also
dividing the tree differently (and for sure less efficiently),
just in order to keep all CPUs busy.

This is because the splitting I have for 2-4 processors uses a lot
of extra conditions which of course sometimes prevent it from splitting
for quite a while somewhere. Something you can't do with 32 processors,
because then 31 of them are idling for a long period of time.

Now I had hoped that all these dinosaurs were slowly gone from computer chess,
and then we get an article at the level Omid posted.

That is really sad.

I would even have swallowed the bugs in the qsearch had he compared
search times with each other.

However he didn't. And I know why.

>Uri


