Author: Vincent Diepeveen
Date: 04:21:25 02/17/04
On February 17, 2004 at 06:35:10, Uri Blass wrote:

>On February 17, 2004 at 06:19:00, Vincent Diepeveen wrote:
>
>>On February 16, 2004 at 22:09:28, Robert Hyatt wrote:
>>
>>>On February 16, 2004 at 18:25:19, Vincent Diepeveen wrote:
>>>
>>>>On February 16, 2004 at 13:28:21, Robert Hyatt wrote:
>>>>
>>>>>On February 16, 2004 at 12:08:28, Vincent Diepeveen wrote:
>>>>>
>>>>>>On February 16, 2004 at 12:02:16, Robert Hyatt wrote:
>>>>>>
>>>>>>>On February 16, 2004 at 11:30:57, Jorge Pichard wrote:
>>>>>>>
>>>>>>>>I still don't understand why Fritz nor Shredder have not been able to get an AMD
>>>>>>>>sponsor, since 95% of the times it is sponsored by company that runs Intel
>>>>>>>>inside. They need to get a different sponsor in order to beat Hydra in the World
>>>>>>>>Championship.
>>>>>>>>
>>>>>>>>Hydra gets effectively around 4 million nodes a second
>>>>>>>>
>>>>>>>>I am very sure that a Quad opteron for a software program is
>>>>>>>>faster than 4 fpga cards 30Mhz are.
>>>>>>>
>>>>>>>quad opteron box is NUMA. There are some issues there that have to be addressed
>>>>>>>by anyone using such a box. Just taking a pure SMP program and dropping it in
>>>>>>>may not produce such good results. Dual opterons are a bit easier to use.
>>>>>>
>>>>>>It works SMP great too. The latency when using it SMP is still faster than quad
>>>>>>xeon chipset can deliver to you.
>>>>>
>>>>>No it isn't. A single cpu has a latency in the 60ns range. Dual is 60 for
>>>>>local, 120 for remote. Go to 4-way and you get 60 for local, 120 for two of the
>>>>>other banks, 180 for the last bank.
>>>>
>>>>
>>>>>That is for a single memory reference. Assuming a TLB hit. IF you get a TLB
>>>>>miss you _die_ just as you do anywhere, except that it is possible that the
>>>>
>>>>I didn't know Crafty nowadays was streaming sequential and that you only are
>>>>multiplying nowadays matrice.
>>>
>>>I didn't know that _all_ you do is random hash probes... In my program, I only
>>>do one call to HashProbe() per node. I do a _lot_ of other stuff per node as
>>
>>So you are denying that each probe you do to hashtable is random and you're
>>saying that you are not using Zobrist in Crafty?
>>
>>Are you or are you not?
>>
>>>well, from generating moves, to ordering moves, to using the Swap() (SEE) code,
>>>to evaluating positions, and so forth.. Most of those are not going to blow the
>>>TLB. Which means that for Crafty, memory access time is going to be almost
>>>exactly memory latency time. Only one reference per node requires the 3-4
>>>access virtual-to-real translation overhead. Out of _thousands_...
>>
>>
>>
>>>
>>>
>>>>
>>>>>memory map tables are in remote memory as well, which means that your memory
>>>>>access time (not latency) turns into 3x or 4x what it should. Opteron uses a
>>>>>3-level map because of the 48 bit virtual address space. That means you do
>>>>>three extra memory reads when you suffer a TLB miss.
>>>>
>>>>>My dual xeon has 150ns latency. TLB misses turn that into 450. The Opteron has
>>>>
>>>>>much more variability. 60ns on a TLB hit, up to 720ns for a TLB miss where the
>>>>>page tables are in remote memory.
>>>>
>>>>>>Even without PGO and using old GCC version SMP version from diep gets a lot of
>>>>>>nps at that box slightly less than it gets at a 8 processor Xeon. the numa
>>>>>>version a lot less (not sharing evaluation tables nor pawn tables and the numa
>>>>>>version tested wasn't sharing qsearch hashtables either).
>>>>>>
>>>>>>See www.aceshardware.com for diep SMP tests at quad opteron boxes.
>>>>>
>>>>>Don't need to. I have my own quad opteron numbers with things done right...
>>>>>Whether you get good numbers with no work or not, you get _better_ numbers when
>>>>>memory is done right. And it can be _significantly_ better. From experience.
>>>>
>>>>Well multiprocessing is way faster of course than multithreading at such
>>>>machines, that includes 2-4 itaniums too.
>>>
>>>
>>>No it isn't. You just don't know how to do multi-threading, apparently. I do
>>
>>You trivially have no idea how advanced that microprocessors are nowadays.
>>You're still toying with the 68000 designs i bet.
>>
>>By the way the 68000 has been named like that because it uses 68000 gates.
>>
>>Nowadays processors have tens of millions of transistors and work very complex
>>and you have not even a remote clue on how they work.
>>
>>>and it is working just fine. And is actually an efficient way to do things as
>>>Eugene has told you many times. Shared egtb buffers is one reason. There is no
>>>reason for threads to be bad...
>>
>>So you do not know how to share memory when running multiprocessor?
>>
>>>
>>>
>>>>
>>>>Your thing is continuesly busy with cache coherency, multiprocessor applications
>>>>don't suffer from that of course. DIEP is multiprocessor.
>>>
>>>My thing is _not_ continually busy with cache coherency. You just don't
>>>understand how "my thing" works, apparently.
>>>Perhaps you should look at your results, when trying to figure out whether what
>>>you are doing is better than another approach. My approach seems to be doing
>>>just fine, based on recent results and performance measurements...
>>
>>I see that Crafty will never end above me in a world champs because you are
>>fearing to join there as you would get crushed like an ant there.
>
>I see that you will never learn that people who do not join world championships
>may do it for other reasons than fear.
>
>There are other tournaments like the CCT and Crafty finished above some
>commercial programs like Hiarcs or Ruffian Or Rebel.
>
>Time control is different but generally the difference in elo between 45+10 and
>tournament time control is not so big to say that crafty has no chance against
>the commercial programs.
>
>Uri

In short you are claiming that in world champs 2004 you will get won positions against junior and others too which will end only in a draw because of some endgame bug.

I'll tell you, that *won't* happen. You will get crushed like an ANT there too. No beginner openings will be played there.
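
For anyone following the latency numbers quoted in the thread, they appear to come from simple arithmetic: on a TLB miss, each page-table level costs one extra memory read before the data read itself. The sketch below is only an illustration of that reading. The function name `access_time_ns` is made up for this note, it assumes every read in the chain pays the same latency, and the 2-level table for the Xeon is inferred from the quoted 150ns to 450ns figure rather than stated anywhere in the thread.

```python
def access_time_ns(latency_ns: int, page_table_levels: int, tlb_hit: bool) -> int:
    """Rough time for one memory reference (hypothetical model).

    On a TLB hit only the data read is paid. On a TLB miss the page-table
    walk adds one extra read per level, and this sketch assumes each of
    those reads costs the same latency as the data read.
    """
    extra_reads = 0 if tlb_hit else page_table_levels
    return latency_ns * (1 + extra_reads)


# Dual Xeon figure from the thread: 150ns latency, 450ns on a TLB miss.
# The 2-level walk is an assumption inferred from 450 / 150 = 3 reads.
print(access_time_ns(150, 2, tlb_hit=False))

# Quad Opteron worst case from the thread: 3-level map, page tables and
# data all in the farthest bank at 180ns per read.
print(access_time_ns(180, 3, tlb_hit=False))

# Quad Opteron best case: local memory, TLB hit.
print(access_time_ns(60, 3, tlb_hit=True))
```

Under these assumptions the three calls give 450, 720 and 60, matching the figures quoted above. How much this matters per node depends on how many references actually miss the TLB, which is the point being argued in the thread.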
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.