Author: Robert Hyatt
Date: 10:28:21 02/16/04
On February 16, 2004 at 12:08:28, Vincent Diepeveen wrote:

>On February 16, 2004 at 12:02:16, Robert Hyatt wrote:
>
>>On February 16, 2004 at 11:30:57, Jorge Pichard wrote:
>>
>>>I still don't understand why Fritz nor Shredder have not been able to get an AMD
>>>sponsor, since 95% of the times it is sponsored by company that runs Intel
>>>inside. They need to get a different sponsor in order to beat Hydra in the World
>>>Championship.
>>>
>>>Hydra gets effectively around 4 million nodes a second
>>>
>>>I am very sure that a Quad opteron for a software program is
>>>faster than 4 fpga cards 30Mhz are.
>>
>>quad opteron box is NUMA. There are some issues there that have to be addressed
>>by anyone using such a box. Just taking a pure SMP program and dropping it in
>>may not produce such good results. Dual opterons are a bit easier to use.
>
>It works SMP great too. The latency when using it SMP is still faster than quad
>xeon chipset can deliver to you.

No it isn't. A single cpu has a latency in the 60ns range. Dual is 60 for local, 120 for remote. Go to 4-way and you get 60 for local, 120 for two of the other banks, 180 for the last bank. That is for a single memory reference. Assuming a TLB hit. IF you get a TLB miss you _die_ just as you do anywhere, except that it is possible that the memory map tables are in remote memory as well, which means that your memory access time (not latency) turns into 3x or 4x what it should.

Opteron uses a 3-level map because of the 48 bit virtual address space. That means you do three extra memory reads when you suffer a TLB miss. My dual xeon has 150ns latency. TLB misses turn that into 450. The Opteron has much more variability. 60ns on a TLB hit, up to 720ns for a TLB miss where the page tables are in remote memory.

>
>Even without PGO and using old GCC version SMP version from diep gets a lot of
>nps at that box slightly less than it gets at a 8 processor Xeon. the numa
>version a lot less (not sharing evaluation tables nor pawn tables and the numa
>version tested wasn't sharing qsearch hashtables either).
>
>See www.aceshardware.com for diep SMP tests at quad opteron boxes.

Don't need to. I have my own quad opteron numbers with things done right... Whether you get good numbers with no work or not, you get _better_ numbers when memory is done right. And it can be _significantly_ better. From experience.

>
>>Another issue is that AMD will likely want to see real 64 bit applications.
>>That is why they developed an interest in Crafty, in fact, because it really
>>needed the 64 bit internal stuff the opteron offers. Fritz, et al don't need
>>nor will they use this particular part of the opteron...
>>
>>
>>>
>>>Fritz and Shredder run in Paderborn on an identically constructed Transtec
>>>diagram workstation with two Intel each Xeon processors with 3,06 Ghz and 2
>>>gigabyte memory. Deep Fritz will also count over 1 gigabyte Hashtabellen and
>>>with 2 to 2.3 million position/second for instance a search depth on 14 to 16
>>>sections will reach, in the final game by means of 20 sections, strongly
>>>dependent on position and material.
>>>
>>>When ordered a Quad Opteron cost perhaps $45k and fpga cards cost only $3000 a
>>>card and a 4 node cluster Quad Xeon 3.06Ghz costs less than $45k.
>>>
>>>Here are some comparison of a Dual Opteron versus a Dual Xeon:
>>>http://www.gamepc.com/labs/view_content.asp?id=opt248vsxeon32a&page=5
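
For anyone who wants to check the latency arithmetic in the reply above, here is a rough sketch. The function names are made up for illustration, and the model is deliberately crude: one data read, plus one extra read per page-table level on a TLB miss, every read paying the same latency. The 60/120/180ns and 150ns figures and the 3-level Opteron map come from the post. The 2-level map for the Xeon is an assumption chosen because it reproduces the 450ns figure, and the average assumes references are spread evenly over the four banks, which a real program may not do.

```python
def access_time_ns(data_latency_ns, table_latency_ns, table_levels, tlb_hit):
    """Time for one memory reference under the simple model above.

    On a TLB hit only the data read is paid for. On a miss, one extra
    read per page-table level is added, each costing table_latency_ns.
    """
    extra_reads = 0 if tlb_hit else table_levels
    return data_latency_ns + extra_reads * table_latency_ns


def mean_latency_ns(bank_latencies_ns):
    """Average TLB-hit latency, ASSUMING references hit every bank equally often."""
    return sum(bank_latencies_ns) / len(bank_latencies_ns)


# Per-bank latencies for the 4-way Opteron as quoted in the post:
# local, two banks one hop away, one bank two hops away.
QUAD_OPTERON_BANKS_NS = [60, 120, 120, 180]

# Best case on the quad Opteron: local memory, TLB hit.
print(access_time_ns(60, 60, 3, tlb_hit=True))      # 60

# Worst case: data and all three table levels in the farthest bank.
print(access_time_ns(180, 180, 3, tlb_hit=False))   # 720

# Dual Xeon with 150ns latency; table_levels=2 is an assumption.
print(access_time_ns(150, 150, 2, tlb_hit=False))   # 450

# Average quad Opteron latency if memory placement is ignored entirely.
print(mean_latency_ns(QUAD_OPTERON_BANKS_NS))       # 120.0
```

Under these assumptions the spread between the 60ns best case and the 720ns worst case on the quad Opteron is what keeping hash and page tables in local memory is meant to avoid.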
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.