Computer Chess Club Archives


Subject: Re: Hyper-Threading Technology from Intel-to Hype or Not to Hype?

Author: Vincent Diepeveen

Date: 08:38:26 03/06/03


On March 05, 2003 at 18:08:45, Robert Hyatt wrote:

You seem to be reading my postings, but I only own a dual 1.6GHz K7 with a
slow 133MHz bus. Because Crafty is a cache eater, the few guys who run a dual
XP at a 150MHz or 166MHz bus of course rock with Crafty on those machines.

See the postings from several of them here at CCC; just look around. I even
remember noticing one from a few weeks ago.

Note that a dual XP 2400, when set to 2.1GHz (so mostly the DDR RAM is slightly
overclocked, not so much the processor), already gets 2.2 million nodes a
second hands down.

If you modified your program so that it has a small hashtable of 128KB in
total, into which you write the last 2 plies or so including qsearch, while the
big global hashtable only gets entries with depthleft >= 2 plies, then this
small hashtable would fit into the L2 cache of your processors.
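
The two-table split described above could be sketched in C roughly as follows.
This is only an illustration under stated assumptions, not Crafty's actual
code: the entry layout, the names (HashEntry, hash_store, hash_probe), the
size of the big table and the always-replace policy are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical 16-byte entry, so 8192 entries fill exactly 128KB. */
typedef struct {
    uint64_t key;    /* full hash key, used to verify a hit            */
    int16_t  score;
    int8_t   depth;  /* remaining depth (depthleft); <= 0 for qsearch  */
    uint8_t  flags;
    uint32_t move;
} HashEntry;

_Static_assert(sizeof(HashEntry) == 16, "entry layout assumed to be 16 bytes");

#define SMALL_ENTRIES (128 * 1024 / sizeof(HashEntry)) /* 8192, power of two */
#define BIG_ENTRIES   ((size_t)1 << 16)                /* assumed; tune to RAM */
#define SPLIT_DEPTH   2                                /* depthleft threshold  */

static HashEntry small_table[SMALL_ENTRIES]; /* meant to stay in L2 cache */
static HashEntry big_table[BIG_ENTRIES];     /* lives in main memory      */

/* Pick the table by remaining depth: shallow nodes and qsearch go to the
   small table, everything with depthleft >= SPLIT_DEPTH to the big one. */
static HashEntry *slot_for(uint64_t key, int depth)
{
    if (depth < SPLIT_DEPTH)
        return &small_table[key & (SMALL_ENTRIES - 1)];
    return &big_table[key & (BIG_ENTRIES - 1)];
}

/* Always-replace store; a real engine would use a smarter policy. */
static void hash_store(uint64_t key, int depth, int score,
                       uint32_t move, uint8_t flags)
{
    assert(depth >= INT8_MIN && depth <= INT8_MAX);
    HashEntry *e = slot_for(key, depth);
    e->key   = key;
    e->score = (int16_t)score;
    e->depth = (int8_t)depth;
    e->flags = flags;
    e->move  = move;
}

/* Returns the entry if the key matches and it was searched at least as
   deep as requested, else NULL. Key 0 is treated as "empty" here. */
static const HashEntry *hash_probe(uint64_t key, int depth)
{
    const HashEntry *e = slot_for(key, depth);
    if (key != 0 && e->key == key && e->depth >= depth)
        return e;
    return NULL;
}
```

Near the leaves the search would then call hash_probe(key, depthleft) with
depthleft below 2 and touch only the 128KB table, so the expensive trips to
main memory are limited to the deeper nodes.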

This will help both the K7 and the P4 a lot in NPS for Crafty and make it eat
less RAM-to-cache bandwidth. The K7 (perhaps the P4 too, I do not know; the P3
doesn't have it though) can benefit a lot there because it can then use the
Alpha 21264 feature of reading from the other processor's L2 cache. The
majority of accesses (more than 50%) will be reads, of course.

Crafty is so fast nowadays that the priority is avoiding the slow lookups to
the hashtable. A hashtable that fits within the L2 cache is just the way to go.

Note that for the transposition table it is possible to do what the pros do and
make each read from it start at the beginning of a cache line. This is possible
to do in C, no need for assembly.
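
One way to get that alignment in plain C is to over-allocate and round the
pointer up to a cache line boundary. The sketch below assumes a 64-byte line
and a 16-byte entry; both numbers, and all names in it (TTEntry, TTBucket,
tt_init and so on), are hypothetical and not taken from any particular engine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed line size in bytes; check your CPU */

typedef struct {
    uint64_t key;
    int16_t  score;
    int8_t   depth;
    uint8_t  flags;
    uint32_t move;
} TTEntry; /* 16 bytes with this layout */

/* One bucket is exactly one cache line, so a probe touches a single line. */
typedef struct {
    TTEntry slot[CACHE_LINE / sizeof(TTEntry)];
} TTBucket;

_Static_assert(sizeof(TTBucket) == CACHE_LINE, "bucket must fill one line");

typedef struct {
    void     *raw;     /* pointer from calloc, kept only for free()   */
    TTBucket *buckets; /* same memory, rounded up to a line boundary  */
    size_t    count;   /* number of buckets, must be a power of two   */
} TransTable;

/* Allocate count buckets plus slack, then round the start address up. */
static int tt_init(TransTable *tt, size_t count)
{
    assert(count != 0 && (count & (count - 1)) == 0);
    tt->raw = calloc(1, count * sizeof(TTBucket) + CACHE_LINE - 1);
    if (tt->raw == NULL)
        return -1;
    uintptr_t p = ((uintptr_t)tt->raw + CACHE_LINE - 1)
                  & ~(uintptr_t)(CACHE_LINE - 1);
    tt->buckets = (TTBucket *)p;
    tt->count   = count;
    return 0;
}

/* Every bucket address is a multiple of CACHE_LINE by construction. */
static TTBucket *tt_bucket(const TransTable *tt, uint64_t key)
{
    return &tt->buckets[key & (tt->count - 1)];
}

static void tt_free(TransTable *tt)
{
    free(tt->raw);
    tt->raw = NULL;
    tt->buckets = NULL;
    tt->count = 0;
}
```

A probe would call tt_bucket() and scan the four slots of the returned bucket,
all of which arrive with the same cache line fill.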

Of course you would never invent that yourself, but it will make Crafty less of
a RAM speed eater and deliver more IPC. In fact you will get so much IPC that
the future is bright and clear for Crafty with regard to NPS.

Getting from the current 2.1 million nodes a second to 3 million a second is no
problem. In fact with a very tiny last-ply hashtable it is possible to become
even more efficient, as you then also hash qsearch.

A lookup in the L2 cache is very cheap compared to an evaluation, and the
majority of transpositions always occur within a couple of hundred thousand
nodes.




>On March 05, 2003 at 15:25:20, Vincent Diepeveen wrote:
>
>>On March 05, 2003 at 11:34:39, Robert Hyatt wrote:
>>
>>See the posted speeds of Crafty on the very cheap dual K7 XPs with a faster
>>FSB and then compare them with your own speeds. Crafty is a lot faster on
>>those machines than on yours. That is despite Crafty being a cache eater
>>(among the chess programs, not among specint).
>
>I will ask again, for a number > 2.16M nodes per second with Crafty.  I have no
>doubt some box can produce that number, including the 3.06ghz xeons.  But to
>date I have not seen one.
>
>I do mean a real number, not a "if I pushed the clock to 2.3ghz it would do
>this..." type number...
>
>We just received our dual 3.06 boxes today.  Unfortunately they came with an
>onboard SCSI controller that supports hardware raid and we had to zap
>everything to "unraid" the disks.  Hopefully I can post some 3.06SMT results
>tomorrow.  These are 3.06 xeons (dual) with 533mhz FSB so it will be
>interesting to see how they compare to my 2.8ghz with 400mhz FSB.
>
>>
>>>On March 05, 2003 at 10:21:32, Vincent Diepeveen wrote:
>>>
>>>>On March 04, 2003 at 22:47:04, Robert Hyatt wrote:
>>>>
>>>>>On March 04, 2003 at 17:39:42, Vincent Diepeveen wrote:
>>>>>
>>>>>>On March 04, 2003 at 16:32:33, Jay-R Delacruz wrote:
>>>>>>
>>>>>>>Do the deep versions of Fritz, Junior and Shredder support hyper-thread? Can
>>>>>>>someone please tell me before upgrading my PC to try the deep versions?
>>>>>>
>>>>>>I just read email from Frans Morsch. DeepFritz7 gets 5-10% speedup by
>>>>>>hyperthreading.
>>>>>>
>>>>>>Shredder gains more in nodes a second than that, but it gets no real
>>>>>>speedup from it, since its SMP speedup is already far smaller (1.5 or
>>>>>>so), so it is smarter to turn SMT/HT off for it. Perhaps shredder8 will
>>>>>>fix this.
>>>>>>
>>>>>>For Diep it speeds me up about 11% in NPS, but I cannot guarantee that on
>>>>>>a 4-processor machine it will give a positive speedup.
>>>>>>
>>>>>>When running 2 processes on a P4 at 3.06GHz it will for sure give some
>>>>>>speedup, because it goes from 100k NPS to 120k NPS. It gets a speedup of
>>>>>>nearly 20% (18.6 or something), which also gives a positive speedup in
>>>>>>depth.
>>>>>>
>>>>>>For DeepJunior we know that it already works badly on an 8-processor Xeon
>>>>>>1.6GHz versus a 4-processor Xeon 1.9GHz, so I *assume* for now that SMT/HT
>>>>>>will not give it much benefit at all, but perhaps Amir or Shay wants to
>>>>>>make a statement about this themselves.
>>>>>>
>>>>>>We are of course talking here about the SMT/HT of Xeon processors up to
>>>>>>2.8GHz, for those which have it enabled. For the P4 3.06GHz, and also
>>>>>>Xeons of that speed and above, things are a different matter.
>>>>>
>>>>>You keep saying that.  It continues to be _wrong_.  The 2.8 xeon has the
>>>>>_exact_ same cpu core (and SMT) that the 3.06 xeon and PIV has.  And when I
>>>>>say _exactly_ I mean _exactly_.  This is _directly_ from Intel... for the
>>>>>record.
>>>>
>>>>Try a better source than the marketing department; try some hardware
>>>>experts, for example at: http://www.realworldtech.com/index.cfm
>>>
>>>
>>>I don't use "marketing types".  And I can send you some dual 3.06 xeon test
>>>results that mirror my dual 2.8's _exactly_ in terms of the 20% to 30% raw
>>>NPS figures.  I sent my "worst case positions" to someone with one of these
>>>machines and he got the _same_ 20% improvement at 3.06 that I got with my
>>>2.8's.
>>>
>>>Your data is simply wrong.  The xeon core has _not_ changed from 2.8ghz to
>>>3.06 ghz, and I have no idea why you want to supply your "disinformation"
>>>that it has.
>>>
>>>We have four of these on the way (dual 3.06 dell 650s) for faculty.  They
>>>have shipped (2/28) so they should be here any time.  I'll run the tests and
>>>post the results to further debunk this "myth" that 3.06's are different...
>>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>Best regards,
>>>>>>Vincent



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.