Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: supercomputers and their future for gameplaying

Author: Uri Blass

Date: 11:57:08 12/23/03

Go up one level in this thread


On December 23, 2003 at 12:11:26, Vincent Diepeveen wrote:

>On December 23, 2003 at 11:22:08, Bob Durrett wrote:
>
>>On December 22, 2003 at 22:42:17, Vincent Diepeveen wrote:
>>
>>>On December 22, 2003 at 21:57:07, Vincent Diepeveen wrote:
>>>
>>>Because my report will take a lot of time and will be very extensive it might be
>>>interesting to share 1 thing which in fact amazed me a lot by the clearness of
>>>the results.
>>>
>>>That is that projects like chessbrain are completely doomed to work in realtime.
>>>*completely* doomed.
>>>
>>>I will show some data here.
>>>The question i asked myself was next: it is trivial that cpu's get faster and
>>>faster each year. Yet latency from processor to processor doesn't keep up. This
>>>is a thing that's trivially shown by several researchers already. Under this
>>>assumption it is trivial then that the hashtable will not be capable of getting
>>>shared. So one must use a local hashtable last few plies. A very clear case of
>>>when this has to happen is when one searches in a distributed project (like
>>>chessbrain).
>>>
>>>In diep there is a parameter what depth to store in global hashtable (and
>>>retrieve) and what to store local. The tournament version used did store all
>>>internal nodes into the hashtable, but the quiescencesearch was stored onto
>>>First the normal output at the position:
>>>
>>>Please remember i took a position which very soon showed itself as a worst case
>>>position and the hashtables were cleaned when starting to search at it. That's
>>>why i used it.
>>>
>>>First the normal diep version:
>>>        00:00      0 0 0 (460) 1 (0,0) 2.119 b4xc5
>>>        00:00      0 0 0 (460) 2 (0,0) 2.119 b4xc5 Qc7xc5
>>>        00:02      0 0 0 (460) 3 (0,0) 2.237 b4xc5 Ra8-e8 Nc3-e4 Bd6xc5
>>>        00:02      0 0 0 (460) 4 (0,0) 2.237 b4xc5 Ra8-e8 Nc3-e4 Bd6xc5
>>>        00:40      0 0 0 (460) 5 (0,0) 1.883 b4xc5 Rf8-e8 Nc3-e2 Qc7-a5 c2-c3
>>>        00:45      0 0 0 (460) 6 (0,0) 1.846 b4xc5 Ra8-e8 Nc3-e4 Ne7-f5 Nf3-g5
>>>Bd6xc5
>>>        00:53      0 0 0 (460) 7 (0,0) 2.406 b4xc5 Rf8-e8 Nc3-e4 Ne7-f5 Nf3-g5
>>>Qc7xc5 c2-c3
>>>        00:56      0 0 0 (460) 8 (0,0) 1.739 b4xc5 Rf8-e8 Nc3-e4 Ne7-f5 Nf3-g5
>>>Qc7xc5 Ra1-b1 Nf5-d4
>>>        01:07      0 0 0 (460) 9 (0,0) 1.970 b4xc5 Rf8-e8 Nc3-e4 Ne7-f5 Nf3-g5
>>>Qc7xc5 Qd1-d2 Nf5-d4 c2-c3
>>>        01:34 563556 53024985 0 (460) 10 (15969,151439) 1.204 b4xc5 Rf8-e8
>>>Nf3-g5 Ne7-f5 Nc3-e4 Bd6xc5 Qd1-g4 Nf5-d6 Ng5-e6 Re8xe6 d5xe6
>>>depth=10 nps=607055 ttl=72846666 egtb=0 aborted=48692678
>>>(afh=25543,splits=261134) time=120.00s
>>>
>>>        03:08 758373 142589422 0 (460) 11 (47875,499297) 1.095 b4xc5 Rf8-e8
>>>Nc3-e2 Qc7xc5 c2-c4 Ba6xc4 Ba2xc4 Qc5xc4 Qd1-d4 Qc4xd5 O-O
>>>depth=11 nps=813654 ttl=182877066 egtb=0 aborted=112295422
>>>(afh=67685,splits=693612) time=224.76s
>>>
>>>        03:58 908254 216881978 0 (460) 12 (78112,841881) 1.050 b4xc5 Rf8-e8
>>>Nc3-e2 Qc7xc5 c2-c4 Ba6xc4 Ba2xc4 Qc5xc4 Qd1-d4 Qc4-c2 Qd4-c3 Qc2-e4
>>>depth=12 nps=957532 ttl=386182356 egtb=0 aborted=241106908
>>>(afh=127633,splits=1400567) time=403.31s
>>>
>>>        09:36 1298189 748172751 0 (460) 13 (245417,2398027) 0.444 b4xc5 Rf8-e8
>>>Nc3-e2 Qc7xc5 c2-c4 Ne7xd5 Qd1-d4 Qc5xd4 Nf3xd4 Nd5-c3 Ke1-f2 Bd6-c5 Kf2-f3
>>>depth=13 nps=3168138 ttl=3095778486 egtb=0 aborted=1759560355
>>>(afh=1258955,splits=8177463) time=977.16s
>>>
>>>        19:20 3364631 3906067861 0 (460) 14 (1603537,10670458) 0.117 b4xc5
>>>Rf8-e8 Nc3-e2 Qc7xc5 c2-c4 Ne7xd5 Qd1-d4 Qc5xd4 Nf3xd4 Nd5-c3 a3-a4 f4-f3
>>>        50:54 5317603 16243842977 0 (460) 14 (5403791,33550882) 0.212 b4-b5
>>>Ba6-b7 Qd1-d2 Ne7-f5 O-O Nf5-e3 Rf1-f2 Rf8-e8 Qd2-d4 Ra8-c8 Bc1xe3 f4xe3 Rf2-e2
>>>Nc5-d3
>>>depth=14 nps=5015287 ttl=17096414590 egtb=0 aborted=9662055619
>>>(afh=5723154,splits=36131262) time=3408.86s
>>>
>>>        01:00:31 5064710 18392394781 0 (460) 15 (6077444,39451488) 0.264 b4-b5
>>>Ba6-b7 Qd1-d2 Ne7-f5 O-O Nf5-e3 Rf1-f2 f7-f5 Qd2-d4 Ra8-e8 Ba2-c4 Nc5-e4 Bc1xe3
>>>f4xe3 a3-a4
>>>depth=15 nps=4294115 ttl=20628201297 egtb=0 aborted=11380590805
>>>(afh=6773788,splits=45841476) time=4803.83s
>>>
>>>        01:31:30 4466087 24521993032 0 (460) 16 (7723311,52786168) 0.160 b4-b5
>>>Ba6-b7 Qd1-d2 Ne7-f5 O-O Nf5-e3 Rf1-f2 f7-f5 Nf3-g5 Qc7-b6 Qd2-d4 h7-h6 Ng5-e6
>>>Nc5xe6 Qd4xb6 a7xb6
>>>        01:37:35 4518356 26456741081 0 (460) 16 (8419398,57816187) 0.165 b4xc5
>>>Rf8-e8 Nc3-e2 Qc7xc5 c2-c4 Ne7xd5 Qd1-d4 Qc5xd4 Nf3xd4 Nd5-c3 Ke1-f1 Nc3xa2
>>>Ra1xa2 Ba6xc4 Nd4-f5 Bc4xa2
>>>depth=16 nps=4454773 ttl=32181820283 egtb=0 aborted=16950269941
>>>(afh=10504865,splits=74699573) time=7224.12s
>>>
>>>        02:22:43 5110928 43766052604 0 (460) 17 (13940238,91131639) 0.609 b4xc5
>>>Rf8-e8 Nc3-e2 Bd6xc5 d5-d6 Bc5xd6 O-O Ne7-f5 Ne2-d4 Qc7-c5 c2-c4 Ra8-d8 Kg1-h1
>>>Bd6-e5 Nf3xe5 Qc5xe5 Qd1-g4
>>>depth=17 nps=4860131 ttl=49287275195 egtb=0 aborted=25516239384
>>>(afh=15928214,splits=109235230) time=10141.14s
>>>
>>>        03:22:12 5265541 63882491771 0 (460) 18 (19968586,133598028) 0.466 b4xc5
>>>Rf8-e8 Nc3-e2 Qc7xc5 c2-c4 Ne7xd5 Qd1-d4 Qc5xd4 Nf3xd4 Nd5-c3 Ke1-f1 Nc3xa2
>>>Ra1xa2 f4-f3 Nd4xf3 Ba6xc4 Ra2-c2 Bc4-d3
>>>depth=18 nps=5272397 ttl=83276304020 egtb=0 aborted=42072953528
>>>(afh=26561999,splits=181805307) time=15794.77s
>>>
>>>        05:47:57 5993449 125127338136 0 (460) 19 (37271002,235161437) 0.293
>>>b4xc5 Rf8-e8 Nc3-e2 Bd6xc5 d5-d6 Bc5xd6 O-O Ne7-f5 Ne2-d4 Nf5xd4 Nf3xd4 Ba6xf1
>>>Kg1xf1 Ra8-d8 Bc1-b2 Bd6xa3 Ba2xf7 Kg8-h8 Ra1xa3
>>>depth=19 nps=6361727 ttl=200869193368 egtb=0 aborted=98884061155
>>>(afh=62657776,splits=376446069) time=31574.63s
>>>
>>>        11:05:57 6941105 277350209487 0 (460) 20 (81637568,463819222) 0.228
>>>b4xc5 Rf8-e8 Nc3-e2 Bd6xc5 d5-d6 Bc5xd6 O-O Ne7-f5 Ne2-d4 Nf5xd4 Nf3xd4 Ba6xf1
>>>Kg1xf1 Ra8-d8 Ra1-b1 Bd6-c5 c2-c3 Bc5-d6 Qd1-f3 Qc7-e7
>>>
>>>In the above output we see that 20 plies get reached after 11 hours. This is not
>>>so important though. 10 plies take 94 seconds to reach. the first 9 ply the node
>>>counts and such are not globally collected. They kept stored local. They do get
>>>counted at 10 ply though. So the numbers at 10 ply and further are correct.
>>>Usually first 9 ply get from hashtable or go so fast that there is no need to
>>>bug 460 processors with the data collection which takes several milliseconds a
>>>ply.
>>>
>>>Now we change the parameter. Last 10 ply get stored only in local hashtable
>>>(including qsearch of course) and the depths above 10 ply get stored in global
>>>hashtable.
>>>
>>>Trivially we just count hashtable. Splitting and such still gets done in the
>>>same manner. Diep doesn't use hashtable properties to split with so many
>>>hashtables (only at small number of cpu's it does), so in reality the results we
>>>get back are a lot worse because of the extra time it takes to split the search
>>>tree to the different processors. Yet even then the next result completely
>>>shocked me too:
>>>
>>>        00:00      0 0 0 (460) 1 (0,0) 2.119 b4xc5 Qc7xc5
>>>        00:00      0 0 0 (460) 2 (0,0) 2.119 b4xc5 Qc7xc5
>>>        00:02      0 0 0 (460) 3 (0,0) 2.237 b4xc5 Ra8-e8 Nc3-e4 Bd6xc5
>>>        00:02      0 0 0 (460) 4 (0,0) 2.237 b4xc5 Ra8-e8
>>>        00:14      0 0 0 (460) 5 (0,0) 1.883 b4xc5 Rf8-e8 Nc3-e2
>>>        00:27      0 0 0 (460) 6 (0,0) 1.846 b4xc5 Ra8-e8
>>>        00:36      0 0 0 (460) 7 (0,0) 2.406 b4xc5 Ne7-f5
>>>        01:04      0 0 0 (460) 8 (0,0) 1.824 b4xc5 Bd6xc5
>>>        03:25      0 0 0 (460) 9 (0,0) 1.970 b4xc5 Rf8-e8 Nc3-e4
>>>        08:38 1222567 633607672 0 (460) 10 (148568,1326106) 0.632 b4xc5 Ra8-e8
>>>Nc3-e4
>>>depth=10 nps=1248100 ttl=725720560 egtb=0 aborted=348576875
>>>(afh=194021,splits=1732467) time=581.46s
>>>
>>>        58:45 1540704 5431214025 0 (460) 11 (889614,7134097) 1.110 b4xc5 Qc7xc5
>>>depth=11 nps=1552287 ttl=5519793255 egtb=0 aborted=2338418994
>>>(afh=940275,splits=7579001) time=3555.91s
>>>
>>>        01:14:21 1567045 6991421375 0 (460) 12 (1415222,10794611) 1.065 b4xc5
>>>depth=12 nps=1592326 ttl=7402982085 egtb=0 aborted=3454686069
>>>(afh=1595556,splits=12242496) time=4649.16s
>>>
>>>        02:51:20 1859415 19115107780 0 (460) 13 (2874460,21811050) 0.444 b4xc5
>>>depth=13 nps=2204845 ttl=24139966284 egtb=0 aborted=10848686727
>>>(afh=4834769,splits=32419200) time=10948.60s
>>>
>>>Without global hashtable you're dead simply! A difference of 5 ply in search
>>>depth. When processors get faster and latencies do not keep up with it, you're
>>>dead simply!
>>>
>>>Even during the tournament where a lot of interesting things happened, when i
>>>saw this result i was completely shocked. I realized that all distributed
>>>project plans i had i had to abandon at once. In fact i *did* have the plan to
>>>make a distributed diep version.
>>>
>>>But the result is just *too* clear. It will never play a realtime game!
>>>5 ply, with all respect, that's even with a branching factor of 3.0 like
>>>
>>>3^5 = 243. Starting at a loss of a factor 243 you're history simply. Trivially
>>>we are talking about an exponential effect here.
>>>
>>>Now this triggered the next experiment i did. Trivially when using more modern
>>>processors than are in this supercomputer thing, you already have a similar
>>>problem like above, but perhaps you can get away by not storing into hashtable
>>>last 2 ply into hashtable.
>>>
>>>Note that the latency this supercomputer has (5.8 microseconds on average to
>>>just get 8 bytes from another processor), it is already dubious to store
>>>everything into hashtables. So let's not store depthleft <= 2 into hashtables
>>>and then search with 460 processors:
>>>
>>>        00:00      0 0 0 (460) 1 (0,0) 2.119 b4xc5 Qc7xc5
>>>        00:00      0 0 0 (460) 2 (0,0) 2.119 b4xc5 Qc7xc5
>>>        00:02      0 0 0 (460) 3 (0,0) 2.237 b4xc5 Ra8-e8 Nc3-e4 Bd6xc5
>>>        00:02      0 0 0 (460) 4 (0,0) 2.237 b4xc5
>>>        00:20      0 0 0 (460) 5 (0,0) 1.883 b4xc5
>>>        00:52      0 0 0 (460) 6 (0,0) 1.846 b4xc5
>>>        01:16      0 0 0 (460) 7 (0,0) 2.406 b4xc5
>>>        01:27      0 0 0 (460) 8 (0,0) 1.824 b4xc5
>>>        05:17      0 0 0 (460) 9 (0,0) 1.970 b4xc5
>>>        05:44 244602 84307066 0 (460) 10 (17774,187986) 1.204 b4xc5 Qc7xc5
>>>depth=10 nps=295935 ttl=107235208 egtb=0 aborted=60058340
>>>(afh=28985,splits=303215) time=362.36s
>>>
>>>        06:51 382970 157779897 0 (460) 11 (48121,494174) 1.095 b4xc5
>>>depth=11 nps=442290 ttl=193360776 egtb=0 aborted=114036575
>>>(afh=69008,splits=680388) time=437.18s
>>>
>>>        07:35 551816 251313751 0 (460) 12 (95107,914561) 1.050 b4xc5 Qc7xc5
>>>depth=12 nps=749278 ttl=452773756 egtb=0 aborted=287457319
>>>(afh=169753,splits=1603527) time=604.28s
>>>
>>>        11:33 1344027 932002639 0 (460) 13 (351281,2939548) 0.444 b4xc5 Qc7xc5
>>>depth=13 nps=3324907 ttl=3753055869 egtb=0 aborted=2206079184
>>>(afh=1475329,splits=8898360) time=1128.77s
>>>
>>>        21:06 3722247 4714375656 0 (460) 14 (1891633,11348602) 0.117 b4xc5
>>>        43:53 5137463 13527660552 0 (460) 14 (5270814,30097544) 0.182 b4-b5
>>>depth=14 nps=4905888 ttl=14334219942 egtb=0 aborted=8253355779
>>>(afh=5601330,splits=32690541) time=2921.84s
>>>
>>>        57:16 5159575 17732532620 0 (460) 15 (6859755,40484655) 0.351 b4-b5
>>>depth=15 nps=4679243 ttl=19836809206 egtb=0 aborted=11352746534
>>>(afh=7505339,splits=45855978) time=4239.32s
>>>
>>>        01:22:45 4792391 23797436449 0 (460) 16 (8718381,52815992) 0.160 b4-b5
>>>        01:29:42 5004482 26937627088 0 (460) 16 (9865863,59231902) 0.165 b4xc5
>>>depth=16 nps=5022682 ttl=33629317406 egtb=0 aborted=18947935034
>>>(afh=12566091,splits=78163568) time=6695.49s
>>>
>>>        02:31:09 5371057 48713667924 0 (460) 17 (17430262,102209832) 0.615 b4xc5
>>>depth=17 nps=5133183 ttl=54610551153 egtb=0 aborted=30461988599
>>>(afh=19878164,splits=120558181) time=10638.73s
>>>
>>>        03:38:29 5527223 72457973813 0 (460) 18 (25186378,148399656) 0.466 b4xc5
>>>depth=18 nps=5556667 ttl=92296420576 egtb=0 aborted=50576134199
>>>(afh=33264389,splits=201240456) time=16610.03s
>>>
>>>        06:20:37 6535370 149251520232 0 (460) 19 (47705397,264107777) 0.293
>>>b4xc5
>>>depth=19 nps=6963743 ttl=247859477324 egtb=0 aborted=131087487407
>>>(afh=83002586,splits=441367183) time=35592.85s
>>>
>>>        12:16:54 7458129 329754401671 0 (460) 20 (104829805,537922298) 0.228
>>>b4xc5
>>>
>>>In the long run it is amazingly scaling just as well, if not better than the
>>>original run. However, the problem is very easy to see. Just look at the time of
>>>10 ply. From 1:34 it goes to 5:44.
>>>
>>>The problem is therefore very trivial. Getting the cpu's to work quickly is a
>>>major problem when you do not store last few plies into global hashtable.
>>>
>>>The search tree is simply getting more unpredictable and more unbalanced, which
>>>is a real major problem. So searching at big clusters with fast opteron cpu's
>>>for example, it will be real hard.
>>>
>>>I've been working for years in the parallel field and already achieved more than
>>>anyone before in parallel search just looking to how diep scales at this machine
>>>when compared to the dinosaurs before me.
>>>
>>>But when the trees get even more unbalanced than they nowadays already are
>>>because of nullmove and a good working global hashtable, then problems really
>>>get massive for future game tree researchers at massive parallel systems.
>>>
>>>Not so much the speedup after an hour or 10 is the problem, but basically
>>>getting a realtime scaling will be the real problem.
>>>
>>>To me it looks like a very difficult task. The problem lies in the first so many
>>>plies and solutions are not trivial. solutions will be like starting first cpu's
>>>at nearby cpu's. Then cpu's that are at 1 router distance and like that.
>>>
>>>Making a game tree searching program like that is *really* complicated.
>>>
>>>Of course i'm speaking about 100+ cpu's here. Up to a cpu or 64 i see so far no
>>>problems.
>>>
>>>So from my viewpoint the chessbrain project is doomed to play games where they
>>>get a better speedup than a quad xeon/opteron will deliver.
>>>
>>>In fact the results have shocked me that much that i have abandonned all project
>>>plans to convert diep to MPI too. As it seems now, diep will remain shared
>>>memory forever. Even though the conversion to MPI is pretty trivial from the
>>>current datastructure, i simply see no point in doing it!
>>>
>>>In a few years when cpu's are a lot faster than they are now, getting to work
>>>fast cpu's with x microsecond latencies is asking for trouble!
>>>
>>>We are however in the sure know that in a few years when cpu's are X times
>>>faster than they are now, that the network latencies from the fastest network
>>>cards (myrilnet) will still be around 5 us (right now 10 us) to get a single
>>>position.
>>>
>>>Not to mention the even bigger problem and that's effeciently splitting all
>>>cpu's and getting them to work in such a way that your branching factor doesn't
>>>suffer.
>>>
>>>Search of chessprograms in 2003 are having a too good branching factor simply to
>>>let a huge cluster (all supercomputers are basically clusters when having a huge
>>>number of cpu's; even the T3E's latencies with opterons get expressed in
>>>microseconds there) compensate for that by raw speed.
>>>
>>>Additionally those clusters/supercomputers carry processors which are x times
>>>slower than the fastest pc processors. So you always start at another
>>>disadvantage, which some years ago was simply not there.
>>>
>>>>On December 21, 2003 at 20:51:19, Bob Durrett wrote:
>>>>
>>>>i am writing a huge technical report for the government about diep at the
>>>>supercomputer. because 512 processors is so much faster than 1, it takes a huge
>>>>time to calculate the speedups.
>>>>
>>>>practical spoken it was worst case about 2 ply deeper in middlegame roughly than
>>>>a fast pc would have given me (dual opteron). Considering 1 cpu was already 10
>>>>times slower practically than a pc processor is, that is pretty good.
>>>>
>>>>Some positions it performed real well. Peek was 9.9 mln nps, for diep this is
>>>>very good. a single processor gives in opening 22k nps and it scaled about
>>>>40-60% at the supercomputer sequential (so practically losing from that 22k nps
>>>>about 11k nps).
>>>>
>>>>And that at tournament time control, which nowadays is a lot quicker than some
>>>>years ago, i do not consider a bad thing.
>>>>
>>>>Zugzwang and similar programs like cilkchess ran at similar machines and had
>>>>despite having a smaller evaluation function, a lot less nps at such machines.
>>>>
>>>>Zugzwang at a faster latency T3E with 512 alpha's 500Mhz (that's what memory
>>>>tells me, all these guys are good in NOT reacting onto emails i ship them)
>>>>peeked around 2 million nodes a second, i remember it very well.
>>>>
>>>>Diep's logfiles just like the huge report i'm writing will be public available.
>>>>That means that everyone who asks after them will get it. The logfiles you can
>>>>already get if you want to.
>>>>
>>>>The testresults i had will be in the report and the entire outputs are publicly
>>>>available too.
>>>>
>>>>The thing missing is single cpu runs to recalculate testresults which were
>>>>obtained by overnight runs.
>>>>
>>>>One of my conclusions will surprise some researchers (not all of them) and the
>>>>government will dislike it. So it will be carefully written down.
>>>>
>>>>Best regards,
>>>>Vincent
>>>>
>>>>> Hi Vincent.
>>>>>
>>>>>Have you had a chance to analyze the GRAZ performance data for your
>>>>>chess-playing program yet?
>>>>>
>>>>>What can you tell us about your technical findings?  Did you achieve the goals
>>>>>you had set for the GRAZ event?
>>>>>
>>>>>Someone said they thought you did not have the opportunity to test your software
>>>>>on the full-up computer hardware prior to the GRAZ event.  If so, did that have
>>>>>an impact?
>>>>>
>>>>>Your findings should shed light on the technical aspects of running chess
>>>>>programs on very large multiprocessor computers [>>8 processors].
>>>>>
>>>>>Bob D.
>>
>>Thanks, Vincent, for responding to my request for your comments.
>>
>>It is interesting to hear that you have already reached a number of strong
>>perceptions and conclusions.  Not everybody will see the technical issues the
>>same way you do and not everybody will share your perceptions and conclusions,
>>but I hope you can deal with their objections [diplomatic or otherwise] in an
>>objective, impartial and unbiased manner.
>
>>Ideally, you will receive inputs from this forum which will help you to better
>
>No, this forum provides 0.
>
>litterary 0.
>
>the only good statement i saw was that 21 november 2003 statement by Bob about
>that he retires :)

I think that he was talking about a case that Crafty is losing against Junior
but Crafty won the match 6-5 with 8 draws.

It is of course not something that he has to do and even in case of a loss he
has the right to change his mind about it.

Note that I almost predicted the right result(I predicted 6-4 or 6-3 for Crafty
and it was 6-5).

>
>At the world championship many were cheering about this :)
>
>>plan your future activities.  Hyatt, in particular, IS capable of helping you to
>>see things which you might not have considered previously, although Bob does
>
>It's like a stone where certain info is wrongly inputted in him in the 80s and
>the next 30 years he's quoting that information in a subjective manner.
>
>Technically he is also outdated everywhere with regards to parallel search.
>
>Every truth that was clearly proven by several persons he gets back on and you
>can start new again. This is a well known tactic by lawyers, but in case of
>hyatt it must be more than a lawyer tactic. It's his way of life.
>
>Because he never shows up at big tournaments (either in europe or asia) each day
>his knowledge gets more antique and each new year new nonsense gets added to his
>point.
>
>I didn't get into the technical latency fight with hyatt which he raised again
>in an email a bit further, simply because it has been already so clear proven by
>many people, among which software written by me and dieter buerssner and it is
>well known in academic/commercial world that there is something called one way
>ping pong latency tests, which are the opposite of bandwidth latency.
>
>his old gigabit network for example is not having a random latency of 0.5 usec.
>In fact you cannot even put that many bytes through it :)
>
>In fact the most expensive interconnects at supercomputers do not even deliver a
>bandwidth latency of 0.5 usec a processor at the same time.
>
>The latency of 64 bits gigabit network is around 20 microseconds to get 8 bytes
>from hashtable at a random processor to you.
>
>the latency of 64 bits myrilnet is 10 microseconds to get 8 bytes. Trivially the
>one way ping pong latency is 5 microseconds.
>
>If old persons like hyatt deny such truths here time and again, what more can i
>say? It's better to ignore this forum when he again is rambling such nonsense
>that his $150 gigabit network card (price nowadays) is faster latency than a new
>$1500 myrilnet work card. Not to mention even more expensive interconnects which
>are put straight onto special supercomputer mainboards (with ECC of course).
>Note that even the cheap $10k router used in Bob's 8 node cluster is eating 50
>ns from bandwidth. So adding the latency of the memory: 400 ns + 2x50 for the
>router you're already starting at 0.5 usec and still must count the pci bus +
>network card latency to that :)
>
>Yet he claims 0.5 usec, see how simple it is to refute the nonsense he writes
>for those who know technical something in that area?
>
>Coming back again onto simply measurable truths which get rewritten time and
>again each time, the hyatt person simply should stick to his word and retire
>when losing that match :)
>
>>seem to have a problem with your style of presentation and his style of
>>presentation can sometimes be a wee bit abrasive at times.  [ : ) ]  Please do
>
>replace abrasive by 'outdated knowledge'. It is logical that knowledge gets
>outdated when someone doesn't show up at computerchess tournaments. Most
>programmers must feel in person reality in such tournaments before realizing
>truth.
>
>Uri Blass is a good example. He started also writing the biggest nonsense here
>but i guess he'll learn a lot in 4-14 july 2004. Especially that one should not
>open 1.h3 :)

I do not think that I will learn it.
I expect to lose because of different reasons and the main reason is that I have
inferior engine even on one processor and I do not plan to make my engine
parallel in the near future.

I believe that it is possible to avoid opening traps by a small book that avoids
the main lines.

I do not claim that it is possible to avoid losses when you have an inferior
engine.

>
>Hyatt is also in the Uri Blass level.

Thanks but I have a lot to learn in order to be in Hyatt's level.
There are some good things that I do in my program that crafty does not do but
today movei is weaker than Crafty even if we force Crafty to use only one
processor.

 The difference is that he doesn't join
>tournaments in 2004 other than some tournament where no commercial programmer is
>chatting with you.



>
>Nothing makes up for chats in a world championship during rounds, between rounds
>and in the restaurant in the evening.

I do not see that it helped you to get to the top level.
diep participate for years in tournament and I do not see it close to topping
the ssdf list.

Uri



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.