Author: Robert Hyatt
Date: 12:36:34 09/06/02
On September 06, 2002 at 13:29:11, Vincent Diepeveen wrote:

>On September 05, 2002 at 07:23:03, Georg v. Zimmermann wrote:
>
>The point in a quoted case such as Enron is that when we talk about
>money and something goes wrong, there are many parties who keep
>themselves busy with such a problem.
>
>Lawyers, prosecutors, financial gurus, government officials, etcetera.
>
>A major problem is that in science we usually talk about something
>where there are only a handful of specialists.
>
>When we talk about parallel search, there are perhaps 20 people
>worldwide who know the difference between using an smp_lock and not
>using it with regard to a formula which gives a linear speedup for
>crafty at n processors.
>
>Usually there is no good control over scientists. Bob isn't the first
>to invent numbers. Of course he did it very radically. He says the
>maximum is 2.0 for his number (which is based upon a loaded hashtable,
>so a non-deterministic problem already occurs even for single-cpu
>results) and he claims 2.0; rounded off or not, it is a claim of 2.0
>out of 2.0.
>
>The sad thing is of course that of these 20 persons, not a single one
>is good at statistics (me included). In fact some even challenge all
>statistics. The few publications done in the parallel search area are
>all really pathetic from a statistical viewpoint.
>
>Person A showing a few 5-ply searches, versus person B showing only 24
>searches from a simplistic game (MChess Pro vs. Cray Blitz).

The thing you don't understand is "background circumstances". Today I
can run as many tests as I want. I can run 'em 9 at a time on my nine
quad 550's. Ten years ago I was lucky to get time to play in a
tournament. Time to test was nearly impossible. Always from midnight
to 6am, and in a good year one such night per month. Cray sold their
spare computing cycles to companies such as Shell, or Lucasfilm, or
whatever. So anything they gave me, while not directly costing them a
cash outlay, directly cost them in lost revenue from the sale of that
time. Contrast that with today.

I don't care whether you try to compare with CB's numbers or not. I am
not sure _I_ will try that kind of test again, as it is far easier to
just smash through a test suite. But the data was interesting, it was
important to me for reasons I have given, and I did it that way to
answer the question I was posing.

>Even though 24 is already much better than the previous research, it
>still is, statistically speaking, very insignificant, because it is a
>game which soon gets to a draw and then a loss for white, so a win
>for black.

What does that have to do with anything? That makes it a _better_
test. Positions where you fail high when you find something good.
Positions where you fail low as you notice something bad. That is a
very good cross-section of what happens in a chess tournament. I could
have picked _any_ game we played that year, but that was the one with
the most dynamics in it.

>As some may realize, it is hard to estimate the possible positive or
>negative effects of seeing a draw score on the speedup of a program.

So? You _must_ deal with them in a game. Unless you expect to win or
lose every game you play. And even when winning you will encounter
draw scores and have to handle them to avoid them...

>There are other effects, such as pondering, the load factor of the
>hashtable, and not clearing it.

Certainly there are. But that is how _I_ have played a game with any
of my chess programs since I first started using hashing. So it is an
important aspect of game play, and had to be included in the test or
it wouldn't have answered the question I posed. I would, instead, be
answering "it gets a speedup of XX.X, but without pondering, so I am
not sure whether that will help or hurt in a game." I chose to
eliminate that problem.
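For concreteness, the "smp_lock" mentioned above is just a spin lock
protecting shared search state. Here is a minimal sketch in C using
the GCC/Clang __sync builtins; the names are hypothetical and this is
not Crafty's actual lock code, only the general technique:

  #include <stdio.h>

  /* Hypothetical minimal spin lock of the kind "smp_lock" refers to.
   * Not any particular program's implementation. */
  typedef volatile int smp_lock_t;

  static void smp_lock_acquire(smp_lock_t *lock) {
    while (__sync_lock_test_and_set(lock, 1))  /* atomically set 0 -> 1 */
      while (*lock)                            /* read-only spin while held */
        ;
  }

  static void smp_lock_release(smp_lock_t *lock) {
    __sync_lock_release(lock);                 /* store 0, release semantics */
  }

  static smp_lock_t hash_lock = 0;

  int main(void) {
    smp_lock_acquire(&hash_lock);
    /* critical section: e.g. update a shared hash-table entry */
    smp_lock_release(&hash_lock);
    puts("ok");
    return 0;
  }

The cost of taking such a lock on every shared update is one source of
the parallel overhead the speedup argument is about.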
>There are many new effects in such research which aren't
>independently controlled and compared.
>
>So not a single speedup number is comparable with other research.

So? When you run your tests to compare against (say) Crafty, do you
use the same hash size? I run most tests with something very small.
Can we compare? Maybe, or maybe not. There are _always_ independent
variables that influence every experimental setup unless the _same_
program version is used for all tests. Then you can compare that
program against itself, but that is all.

>Other research in turn has the problem that it compares against
>dead-slow programs, or programs which are completely different. (For
>example, I remember a study by Feldmann which compared a program A
>with a very small hashtable versus a program B with a big hashtable,
>where all researchers know very well that a very small hashtable is
>deadly for parallel speedup. In short, a big hashtable is a big
>advantage.)

I would say "advantage", not necessarily a "big advantage". It
certainly helps, and for my program, varying the size of the hash
table can make the serial program run up to maybe 2x faster in
middlegames. 2x isn't huge, although it inflates parallel speedup
numbers. Several of us pointed this out when he presented the paper at
one of the ACM events. It was a flaw that he may or may not have
realized before he did it. I certainly didn't jump up and say "that's
a fraud". I stood up, asked a question that pointed out a mistake, and
let them answer (the Waycool guys were doing the presentation of their
speedup; I don't remember if Feldmann was involved with that or not).

>In short, there is no clear standard in parallel search research, and
>none of the published scientists, with the possible exception of
>Jonathan Schaeffer, seems to care much about it either.

Jonathan used Kopec. I used Kopec in every speedup I published, until
I did the JICCA paper. My original DTS dissertation is _only_ Kopec
for the chess part of the results. I think that is a bit unhealthy,
but it did connect it to other results.

>I have heard, when talking to different programmers who have a
>parallel version, major criticism of results as published. I hope
>some people who read this realize very well that I'm not only
>speaking for myself here.
>
>I get daily email from programmers who will never post such things
>here themselves, but who completely agree with the criticism as
>posted.
>
>Very well known but hardly published is of course the biggest
>criticism, which has already been posted by me and others too, and
>which concerns something Bob clearly did NOT do.
>
>The criticism concerns the fact that some programs, in order to show
>a better speedup, were first completely crippled and slowed down by
>factors of magnitude, before speedups were 'measured' or 'guessed' or
>even 'extrapolated'.

Don't follow. Cray Blitz was not "crippled, slowed down, etc." at all.
Neither was Crafty. The original rewrite cost me maybe 5%, but this
went down as the compiler got better and I learned how to set the
optimizer options a bit better. If that is what you are talking about.
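To make the hashtable point above concrete: a transposition table is
indexed by hashing the position, and a table far too small for the
tree being searched overwrites entries before they can be hit again,
so the search loses the work those hits would have saved. A minimal
sketch follows (hypothetical layout, not Feldmann's or any particular
program's code):

  #include <stdint.h>
  #include <stdlib.h>

  typedef struct {
    uint64_t key;     /* Zobrist hash of the position */
    int16_t  score;   /* stored search result */
    uint8_t  depth;   /* draft the result was searched to */
  } tt_entry;

  static tt_entry *tt;
  static uint64_t  tt_mask;            /* entries - 1 */

  void tt_init(uint64_t entries) {     /* entries must be a power of two */
    tt = calloc(entries, sizeof(tt_entry));
    tt_mask = entries - 1;
  }

  tt_entry *tt_probe(uint64_t key) {
    tt_entry *e = &tt[key & tt_mask];
    return (e->key == key) ? e : NULL; /* hit only on a full key match */
  }

  void tt_store(uint64_t key, int score, int depth) {
    tt_entry *e = &tt[key & tt_mask];  /* simple always-replace scheme */
    e->key = key;
    e->score = (int16_t)score;
    e->depth = (uint8_t)depth;
  }

  int main(void) {
    tt_init(1024 * 1024);              /* 1M entries, a power of two */
    tt_store(0x123456789abcdef0ULL, 25, 9);
    return tt_probe(0x123456789abcdef0ULL) ? 0 : 1;
  }

With too few entries, tt_store keeps evicting useful results, which
hurts a serial search and hurts a parallel search even more, since the
processors can no longer share each other's work through the table.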
>Obviously some things are hardware dependent. The alpha clusters that
>zugzwang ran on (forgive me if it was other machines too), as well as
>the Sun hardware that Cilkchess ran on: it obviously is harder to
>communicate there than on shared-memory machines (and as we know
>there are differences there too).
>
>That doesn't change the fact that slowing down a program tens of
>times is not a very good idea. Nevertheless the speedup then looks
>very good. Why?
>
>Such things are simply not provable other than by knowing what type
>of program it is and knowing how fast similar programs can run on
>such processors.
>
>Let me give you for example the different cilkchess versions:
> - single cpu cilkchess: 5000-10000 nodes a second
> - single cpu other program (non cilk) from Don: 200000 nodes a
>   second. Of course it had no eval, but even then it would get
>   100k nps.
>
>Zugzwang was around 5000 nodes a second per cpu, if I understand
>correctly. Gnuchess, very comparable in speed with zugzwang on the
>same type of cpus, is (if using 64-bit code rather than 16-bit code)
>considerably faster. 100k nodes a second?
>
>Of course this is all a linear slowdown.

Fine. But for me, Cray Blitz ran at xK nodes per second on one cpu,
2xK on two, etc. I didn't have that problem. The C90 ran at 500K nps
on 16 cpus, so you can figure out the speed of one. A YMP ran at about
160-200K on 8 cpus (slower clock). You could figure out the one-cpu
speed and get a perfect match.

>However, suppose I have on my pc a diep version getting 80k nps on a
>1.6GHz processor.
>
>It is of course complete swindling, IMHO, to then run it on a 1.6GHz
>supercomputer processor, which even is 64 bits, at say 8k nps on a
>single cpu there.
>
>The best comparison is the R14000 SGI chips. I have been busy
>*optimizing* the single-cpu speed of diep for it the past 2 months.
>Though not entirely finished yet, it gets, single cpu, about 25000
>nodes a second at P2, without locking, with 400MB hash.
>
>That's with a very small hashtable. Of course it will be a bit slower
>when using a big hashtable, because that is memory from other nodes
>than this one.
>
>Now suppose I use in all tests something which fits in the local
>memory of 1 node (which is 2 GB).
>
>So suppose I run it on 30 processors out of 32 and do not get 25000
>nodes a second for the parallel version, but like 10 to 20 times
>less, say 1250-2500 nodes a second, and do all my speedup comparisons
>with that 1250-2500 nodes a second.
>
>Bob *never* did this.

Bob never had that architectural problem _either_. So how _could_ I do
that? My quads are not NUMA. The Cray was a full crossbar, not NUMA...
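Working backward from the figures above (the NPS numbers come from the
posts themselves; the near-linear NPS scaling is the claim under
discussion, and the search times at the end are purely hypothetical):

  #include <stdio.h>

  int main(void) {
    /* NPS figures quoted above */
    double c90_nps = 500000.0, c90_cpus = 16.0;
    double ymp_nps = 180000.0, ymp_cpus = 8.0;  /* midpoint of 160-200K */

    printf("C90 per-cpu nps: ~%.0f\n", c90_nps / c90_cpus);  /* ~31250 */
    printf("YMP per-cpu nps: ~%.0f\n", ymp_nps / ymp_cpus);  /* ~22500 */

    /* NPS scaling is not speedup.  Speedup is time-to-result, T1/Tn,
     * and a parallel search burns extra nodes, so speedup is normally
     * below the NPS ratio.  Hypothetical times for illustration: */
    double t1 = 600.0, tn = 50.0;
    printf("speedup on this position: %.1f\n", t1 / tn);     /* 12.0 */
    return 0;
  }

This is also why the 10-to-20-times NPS drop described above matters:
if the parallel version's per-node NPS collapses, a healthy-looking
"speedup" measured against that slowed-down baseline says little about
real performance.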
>The majority of the others did, however.
>
>I sure won't do it. In fact I don't even lose system time to being
>multithreaded. I'm multiprocess, which is ideal on such machines,
>because operations from node to node are more expensive than on PCs.
>
>In fact the old diep version already got about 500000 nodes a second
>when I ran it on a few processors. However, it took long to reach
>that speed. So for the latest version I am working on an additional
>CONDITION: I do not only want to get a great speed out of it, I also
>want to get that speed within 90 seconds.
>
>I do not know what these scientists feel they got paid for by their
>universities, but I sure know I would not have paid them for their
>results. If I had been their boss, I would have fired them.
>
>Nevertheless the mentality at universities is the opposite of this
>mentality.
>
>At computer chess tournaments, the programmers do not refer to
>'university chess programs' for nothing. If they do, it isn't meant
>positively.

That may be _your_ perception. When I was playing, it was not an
issue. "University programs" were feared, not hated.

>As I said, cray blitz with respect to its speed definitely doesn't
>belong in that category, but with respect to all the facts posted I
>definitely doubt it ever was good by the standards I refer to.

I don't understand the standards you want, so it is impossible to
comment...