Computer Chess Club Archives

Subject: Re: Hyper-Threading Technology from Intel-to Hype or Not to Hype?

Author: Robert Hyatt

Date: 13:36:15 03/06/03

On March 06, 2003 at 11:38:26, Vincent Diepeveen wrote:

>On March 05, 2003 at 18:08:45, Robert Hyatt wrote:
>
>you read postings of me seemingly, but i am only owning a dual 1.6ghz K7 with a
>slow 133Mhz bus, and because crafty is a cache eater a few guys which are at
>150Mhz or 166Mhz bus with a dual XP, they rock of course with crafty on that
>machine.
>
>See postings of several of them here at CCC. Just look around. I remember one
>from a few weeks ago even which i noticed.
>
>note that a dual XP 2400 already when put to 2.1ghz (so a very little bit
>overclocked the DDR ram, not so much the processor) is getting
>2.2MLN nodes a second hands down.

I'm waiting for someone to post a real result here from a _stock_ CPU.  I don't
go for overclocking and I don't consider overclocked numbers.

And this NPS is on the bench command, with no crafty.rc file (default hash and
everything).


>
>if you would modify your thing such that it has a 128KB hashtable in total where
>you write in last 2 ply or so including qsearch, and in the global big hashtable
>you write >= 2 ply depthleft, then this small hashtable would get into the L2
>cache of your processors.

What's the point?   We all run the "bench" command the same way, using the
default hash table size.
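
For readers following the quoted proposal, here is a minimal sketch of what a two-tier table could look like. It is not Crafty's or Diep's code. The class name, the 16-byte entry size, the size of the big table, and the always-replace policy are all assumptions made for illustration; only the 128KB total and the "depthleft >= 2" split come from the quoted post.

```python
# Hypothetical two-tier transposition table, for illustration only.
SMALL_BYTES = 128 * 1024                     # 128KB total, from the quoted post
ENTRY_BYTES = 16                             # assumed entry size
SMALL_ENTRIES = SMALL_BYTES // ENTRY_BYTES   # number of slots in the small table
BIG_ENTRIES = 1 << 16                        # assumed size of the main table


class TwoTierTable:
    def __init__(self):
        self.small = [None] * SMALL_ENTRIES  # meant to stay resident in L2 cache
        self.big = [None] * BIG_ENTRIES      # the usual large table in RAM

    def _slot(self, key, depth_left):
        # Route by remaining depth: shallow nodes (including qsearch) go to
        # the small table, everything with depth_left >= 2 to the big one.
        if depth_left >= 2:
            return self.big, key % BIG_ENTRIES
        return self.small, key % SMALL_ENTRIES

    def store(self, key, depth_left, score):
        table, index = self._slot(key, depth_left)
        table[index] = (key, depth_left, score)  # always-replace, an assumption

    def probe(self, key, depth_left):
        table, index = self._slot(key, depth_left)
        entry = table[index]
        # Usable only if it is the same position searched at least as deep.
        if entry is not None and entry[0] == key and entry[1] >= depth_left:
            return entry[2]
        return None
```

Note that in this simplified version a shallow probe never consults the big table, so it can miss a deeper entry stored there. Whether such a split helps at all depends on how much time hashing actually takes, which is the point under dispute in this thread.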



>
>This will help both K7 and P4 a lot in NPS for crafty and make it less of a RAM
>to cache eater, the K7 (perhaps P4 too i do not know, P3 doesn't have it though)
>can benefit a lot there because it can then use the alpha 21264 feature where it
>can read from the L2 cache of the other guy while reading. Majority will be
>reads of course. > 50% of the cases will be reads.
>
>Crafty is nowadays that fast that priority is avoiding the slow lookups to the
>hashtable. A hashtable that fits within L2 cache is just the way to go.
>
>Note that for the transpositiontable it is possible to do like the pro's do and
>put each read to it at the start of a cache line. This is possible to do in C,
>no need for assembly.

Only if you make the entry a power of two.   Mine is close now in that a triplet
is 48 consecutive bytes.  But I don't consider hashing a big problem.  I could
turn it totally off to see what it does to NPS of course, but I don't really
care.
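
The power-of-two remark can be checked with a little arithmetic. The sketch below assumes a 64-byte cache line and a table whose base address is line-aligned; neither figure comes from the post, and the function name is made up for illustration. It counts how many consecutive fixed-size buckets cross a line boundary.

```python
from math import gcd

LINE_BYTES = 64  # assumed cache-line size; not stated in the post


def straddle_fraction(bucket_bytes, line_bytes=LINE_BYTES):
    """Fraction of consecutive buckets that span two cache lines,
    assuming the table starts on a line boundary."""
    # The layout pattern repeats every lcm(bucket, line) bytes,
    # which is line_bytes // gcd(bucket, line) buckets.
    period = line_bytes // gcd(bucket_bytes, line_bytes)
    straddling = 0
    for i in range(period):
        first = i * bucket_bytes            # offset of the bucket's first byte
        last = first + bucket_bytes - 1     # offset of the bucket's last byte
        if first // line_bytes != last // line_bytes:
            straddling += 1
    return straddling / period


for size in (16, 32, 48, 64):
    print(size, straddle_fraction(size))
```

Under these assumptions a 48-byte triplet crosses a line boundary in half of the buckets, while 16-, 32- and 64-byte buckets never do. That is why a power-of-two size is needed before every probe can start at the beginning of a cache line.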

>
>Of course you would never invent that yourself but it will make crafty less of a
>RAM speed eater and deliver more IPC. In fact so much IPC you will get then that
>the future is bright and clear for crafty then with regards to NPS.
>
>Getting from the current 2.1MLN a second to 3MLN a second is no problem. In fact
>with a very tiny last ply hashtable it is possible to even get more efficient as
>you also hash qsearch.

That is total baloney.  Since > 50% of my total time is spent in Evaluate()
there is no way to get from 2.1 to 3M by fiddling with the hash probe stuff.
Here is the most recent profile output:
  2.37     34.50     0.96   808908     0.00     0.00  HashStore

  2.03     35.44     0.94   944844     0.00     0.00  HashProbe

That first column is % time.  So my hashing code is 4.40% of the total search
time; how are you going to reduce that and make the program go 50% faster to
reach 3M?

The obvious answer is "you aren't."

And I don't see why you don't see the flaw in your logic.
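
The argument here is Amdahl's law. As a rough check, the sketch below takes the two percentages from the profile lines (2.37 and 2.03, which sum to 4.40) and computes the best possible speedup if hashing cost nothing at all, alongside the speedup needed to go from 2.1M to 3M nodes per second. The variable and function names are illustrative only.

```python
def max_speedup(fraction_removed):
    """Amdahl's law: speedup when a fraction of run time is eliminated."""
    return 1.0 / (1.0 - fraction_removed)


# HashStore + HashProbe share of total time, from the profile above.
hash_fraction = (2.37 + 2.03) / 100.0

best_case = max_speedup(hash_fraction)   # hashing made completely free
needed = 3.0 / 2.1                       # speedup required for 2.1M -> 3M NPS
fraction_needed = 1.0 - 2.1 / 3.0        # share of run time that must vanish

print(f"best case from free hashing: {best_case:.3f}x")
print(f"speedup needed for 3M NPS:   {needed:.3f}x")
print(f"run time that must vanish:   {fraction_needed:.0%}")
```

Even with hashing removed entirely, the gain is under 5%, whereas reaching 3M would require eliminating about 30% of the run time. This ignores any indirect effect such as cache pressure on other code, which the profile alone cannot show.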

>
>A lookup to the L2 cache is very cheap compared to an evaluation and the
>majority of transpositions is always within a couple of hundreds of thousands of
>nodes.

No lookup at all would make me about 4.6% faster, but make the size of the tree
_way_ larger.

This is the kind of illogical reasoning that drives people mad.


>
>
>
>
>>On March 05, 2003 at 15:25:20, Vincent Diepeveen wrote:
>>
>>>On March 05, 2003 at 11:34:39, Robert Hyatt wrote:
>>>
>>>see the posted speeds of crafty at the very cheap dual K7 XPs with faster FSB
>>>and then compare that with your own speeds. Crafty is a lot faster on those
>>>machines than yours. That despite crafty is a cache eater (among the
>>>chessprograms, not among specint).
>>
>>I will ask again, for a number > 2.16M nodes per second with Crafty.  I have no
>>doubt some box can produce that number, including the 3.06ghz xeons.  But to
>>date I have not seen one.
>>
>>I do mean a real number, not a "if I pushed the clock to 2.3ghz it would do
>>this..." type number...
>>
>>We just received our dual 3.06 boxes today.  Unfortunately they came with an
>>onboard SCSI controller that supports hardware raid and we had to zap
>>everything to "unraid" the disks.  Hopefully I can post some 3.06SMT results
>>tomorrow.  These are 3.06 xeons (dual) with 533mhz FSB so it will be
>>interesting to see how they compare to my 2.8ghz with 400mhz FSB.
>>
>>>
>>>>On March 05, 2003 at 10:21:32, Vincent Diepeveen wrote:
>>>>
>>>>>On March 04, 2003 at 22:47:04, Robert Hyatt wrote:
>>>>>
>>>>>>On March 04, 2003 at 17:39:42, Vincent Diepeveen wrote:
>>>>>>
>>>>>>>On March 04, 2003 at 16:32:33, Jay-R Delacruz wrote:
>>>>>>>
>>>>>>>>Do the deep versions of Fritz, Junior and Shredder support hyper-thread? Can
>>>>>>>>someone please tell me before upgrading my PC to try the deep versions?
>>>>>>>
>>>>>>>I just read email from Frans Morsch. DeepFritz7 gets 5-10% speedup by
>>>>>>>hyperthreading.
>>>>>>>
>>>>>>>Shredder gets more speedup in nodes a second than that, but it gets no speedup
>>>>>>>from it as it gets SMP already a far smaller speedup (1.5 or so), so it is
>>>>>>>smarter to turn SMT/HT off for it. perhaps shredder8 will fix this.
>>>>>>>
>>>>>>>For diep it speeds me up about 11% in NPS but i cannot guarantee that at a 4
>>>>>>>processor it will give a positive speedup.
>>>>>>>
>>>>>>>When running 2 processes at a P4 at 3.06ghz it will give for sure some speedup
>>>>>>>because it goes from 100k nps to 120k nps. Nearly 20% speedup it gets with it
>>>>>>>(18.6 or something) which gives a positive speedup also in depth.
>>>>>>>
>>>>>>>For deepjunior we know that it already works bad at 8 processor Xeon 1.6Ghz
>>>>>>>versus 4 processor Xeon 1.9Ghz, so i *assume* for now that SMT/HT will not give
>>>>>>>it much benefit for it at all, but perhaps Amir or Shay wants to give a
>>>>>>>statement regarding this themselves.
>>>>>>>
>>>>>>>We talk of course about the SMT/HT from Xeon processors up to 2.8Ghz now for
>>>>>>>those which have it enabled. For the P4 3.06Ghz and also Xeons of that and above
>>>>>>>things are a different matter.
>>>>>>
>>>>>>You keep saying that.  It continues to be _wrong_.  The 2.8 xeon has the
>>>>>>_exact_ same cpu core (and SMT) that the 3.06 xeon and PIV has.  And when I
>>>>>>say _exactly_ I mean _exactly_.  This is _directly_ from Intel... for the
>>>>>>record.
>>>>>
>>>>>try some better source instead of the marketing department try some hardware
>>>>>experts. for example at: http://www.realworldtech.com/index.cfm
>>>>
>>>>
>>>>I don't use "marketing types".  And I can send you some dual 3.06 xeon test
>>>>results that mirror my dual 2.8's _exactly_ in terms of the 20% to 30% raw
>>>>NPS figures.  I sent my "worst case positions" to someone with one of these
>>>>machines and he got the _same_ 20% improvement at 3.06 that I got with my
>>>>2.8's.
>>>>
>>>>Your data is simply wrong.  The xeon core has _not_ changed from 2.8ghz to
>>>>3.06 ghz, and I have no idea why you want to supply your "disinformation"
>>>>that it has.
>>>>
>>>>We have four of these on the way (dual 3.06 dell 650s) for faculty.  They
>>>>have shipped (2/28) so they should be here any time.  I'll run the tests and
>>>>post the results to further debunk this "myth" that 3.06's are different...
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>Best regards,
>>>>>>>Vincent


