Author: Robert Hyatt
Date: 17:33:02 06/06/02
On June 06, 2002 at 19:54:24, Vincent Diepeveen wrote:

>On June 06, 2002 at 10:24:27, Robert Hyatt wrote:
>
>There is a huge difference between the test on this
>processor, because running at 2 processors it was very
>slow from hardware viewpoint. Like 1.5 was mentioned
>then too.

OK... so what? It was very fast on the single-cpu test...

>
>Also we talk about a very old version of crafty here compared
>to the crafty that's existing today. I remember that you had
>way less king safety and some other scans in these crafties
>and you did do less here and there.

It was the last 16.x version, I believe. I am now on 18.15, but king safety
hasn't been greatly modified during that span...

>
>In short about all was allowed to get more nps, whereas
>right now the 'default' assembly used for K7/P4 is fucking
>slow beginners assembly. This was of course not put to
>'slow' at this alpha test, as there were no 'specint'
>limits.

I don't know what you mean. I know for 100% certainty that Tim didn't modify
the source code. He was running gnuchess on ICC one night and we noticed an
impossible NPS. I asked if he would try crafty and he said sure. I sent him
the source, and we had benchmark numbers about 10 minutes later. He then ran
WAC (one minute/pos) and sent me the results, which I include here:

1 cpu 21264/600mhz:

    total positions searched..........         300
    number right......................         300
    number wrong......................           0
    percentage right..................         100
    percentage wrong..................           0
    total nodes searched.............. 236973211.0
    average search depth..............         4.5
    nodes per second..................      783641

4 cpus quad xeon 550:

    total positions searched..........         300
    number right......................         299
    number wrong......................           1
    percentage right..................          99
    percentage wrong..................           0
    total nodes searched.............. 280348143.0
    average search depth..............         4.5
    nodes per second..................      722788

2 cpus, 21264/600mhz:

    total positions searched..........         300
    number right......................         300
    number wrong......................           0
    percentage right..................         100
    percentage wrong..................           0
    total nodes searched.............. 330905102.0
    average search depth..............         4.5
    nodes per second..................     1266767

Not bad. I had remembered 1M and 1.5M. I just verified that those numbers
were produced on a 667mhz machine instead, at Compaq. A slightly faster
version of Tim's machine. And right in line with the 1.5M single-cpu speed
of McKinley at 1ghz.

>
>It was *not* a production alpha ever, the test was done long
>before this type of alpha was put on the market, so we don't
>know whether you can buy this alpha in the shop.

I have no idea what you are talking about. I had exactly that machine here
in my lab for 6+ months (single-cpu version). It ran at 667 mhz and produced
1M nodes per second. I didn't do much with chess on it, as it was here to do
some work for someone up the street from here. But it was (and is) available
for purchase. I had that machine over a year ago. It was not a "black box"
but had a name plate on the front and could be ordered from whoever owned
the DEC stuff at that point in time. Someone up in the medical school bought
the thing, left it here for me to work on some code for him, and that was
that...

>
>There is another list of things wrong.
>
>For example if it was such a slow processor, why only getting
>1.5 hardware speedup out of 2 processors?

Because the hash table used locks. And the locks were very bad on the alpha.
We later went to the "lockless hash table" that I now use. I never had access
to either machine (Tim's or the one in the medical school here) to run WAC
again after that was fixed. The out-of-order memory writes on the alpha
require a "barrier" prior to clearing the lock, and the lock/unlock
themselves are also very expensive. Both together (lock/barrier) really
produced a bottleneck.
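For anyone who has not seen the idea, here is a minimal sketch of a lockless hash entry. The post only names the technique; the XOR check below is the scheme usually described for lockless transposition tables and is assumed here, not copied from Crafty's source. All names (tt_entry, tt_store, tt_probe, tt_simulate_torn_write, TT_ENTRIES) are hypothetical, and the C11 relaxed atomics are a choice made for this sketch only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TT_ENTRIES 1024u /* hypothetical table size, must be a power of two */

static_assert((TT_ENTRIES & (TT_ENTRIES - 1u)) == 0u, "size must be a power of two");

typedef struct {
    _Atomic uint64_t key_xor_data; /* position key XOR data, never the raw key */
    _Atomic uint64_t data;         /* packed score/depth/move; layout is up to the engine */
} tt_entry;

static tt_entry tt[TT_ENTRIES];

/* Store with no lock and no barrier: two independent relaxed 64-bit writes. */
void tt_store(uint64_t key, uint64_t data) {
    tt_entry *e = &tt[key & (TT_ENTRIES - 1u)];
    atomic_store_explicit(&e->key_xor_data, key ^ data, memory_order_relaxed);
    atomic_store_explicit(&e->data, data, memory_order_relaxed);
}

/* Probe with no lock. Returns 1 and fills *data only when the two words
 * belong together for this key; a mixed (torn) entry is reported as a miss. */
int tt_probe(uint64_t key, uint64_t *data) {
    tt_entry *e = &tt[key & (TT_ENTRIES - 1u)];
    uint64_t d = atomic_load_explicit(&e->data, memory_order_relaxed);
    uint64_t k = atomic_load_explicit(&e->key_xor_data, memory_order_relaxed);
    if ((k ^ d) != key)
        return 0;
    *data = d;
    return 1;
}

/* Demo helper only: overwrite just the data word, as if another processor's
 * store had landed halfway between the two writes of tt_store. */
void tt_simulate_torn_write(uint64_t key, uint64_t other_data) {
    tt_entry *e = &tt[key & (TT_ENTRIES - 1u)];
    atomic_store_explicit(&e->data, other_data, memory_order_relaxed);
}
```

The point of the design is that a half-written entry simply fails the XOR check and gets treated as a miss, so the pair of writes needs neither a lock nor a barrier around it. One quirk of this sketch: on an all-zero table, key 0 would match data 0, so a real table would want to rule that case out.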
No mystery at all... I think we mentioned this in the paper we wrote for
ICCA, which ought to appear in the next issue.

>
>That means a cheap dual K7 getting 2 million nodes a second is still
>faster than this 1.5 million nodes a second dual alpha.

I have not yet seen a dual K7 get 2M nodes per second with Crafty...

>
>Note that we compare a today's crafty version with that special
>old thing then. Also we assume then beginners assembly for the
>current dual K7 crafty, versus optimal defines for the alpha.

The version Tim had was not that old. The version I ran on the 667 mhz
machine was even newer, in the 17.x group...

>
>That's not a very fair compare.

Seems perfectly fair to me...
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.