Computer Chess Club Archives


Subject: Re: Hyper Threading and Chess

Author: Matt Taylor

Date: 10:57:13 12/31/02


On December 31, 2002 at 11:49:31, Vincent Diepeveen wrote:

>On December 30, 2002 at 22:32:52, Robert Hyatt wrote:
>
>>On December 30, 2002 at 20:29:11, Vincent Diepeveen wrote:
>>
>>>On December 30, 2002 at 19:39:23, Frank Koenig wrote:
>>>
>>>>Two questions.
>>>>
>>>>One) Will Intel's HT technology be able to help chess programs above and beyond
>>>>just allowing one CPU to appear as two?
>>>>
>>>>Second) If you are running XP, will HT require XP Pro instead of XP Home to take
>>>>advantage of it?
>>>>
>>>>Thanks,
>>>>
>>>>Frank
>>>
>>>For dual machines you need even newer releases of OSes to still get
>>>released.
>>>
>>>However you can profit from it in a very limited way. It's a speedup of
>>>18% for DIEP at the latest P4 (3.06Ghz), at older P4s the profit is less
>>>(like P4 Xeon 2.8Ghz) and even older P4s the profit is zero or negative.
>>
>>Any chance you will _ever_ "test before talking"?
>>
>>The 2.8 xeon has the _same_ SMT core as the PIV/3.06.  The _same_ means
>>"the same", not "something that is not as good as."
>
>http://www.realworldtech.com/index.cfm  and ask intel designers themselves.
>
>>That is simply a crock statement that is nonsense.  From _testing_ on
>>my part...
>
>I see a clear difference in performance. Intel managed to slowly improve SMT
>to what it is now. I do not find 18% impressive knowing the chip is already
>that much slower than the K7 for me.

Um, they don't "fix" things that quickly. The Hyperthreading in the Pentium 4
3.06 GHz is the same Hyperthreading in the Xeon 2.2, 2.4, 2.53, and 2.8 GHz
chips.

You still seem to be hung up on IPC. The K7 had better IPC than the Pentium
Pro/2/3. The Pentium 4 is not designed to maximize the work done per clock.
It is designed to ramp to high clock speeds, something the K7 will not do. The
K7 is nearing the end of its lifetime -- it's running at more than 4 times
its introductory clock speed.
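
To make the IPC-versus-clock tradeoff concrete, here is a rough sketch. Throughput is roughly IPC times clock frequency, so a chip with lower IPC can still come out ahead if it clocks high enough. The IPC and clock values below are made-up illustrations, not measurements of any real Pentium 4 or K7.

```python
def instructions_per_second(ipc, clock_hz):
    """Rough throughput model: instructions retired per second."""
    return ipc * clock_hz

# Hypothetical chips for illustration only (not real P4/K7 figures).
high_clock = instructions_per_second(ipc=1.0, clock_hz=3.0e9)
high_ipc = instructions_per_second(ipc=1.4, clock_hz=2.0e9)

print(f"high-clock design: {high_clock:.2e} instr/s")
print(f"high-IPC design:   {high_ipc:.2e} instr/s")
```

With these invented numbers the high-clock design wins narrowly despite the worse IPC, which is the whole point of designing for clock speed.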

>>
>>>
>>>So it's progressing but the P4 is a processor not really mature enough:
>>>too little trace cache and too little datacache: just 1024 quadwords;
>>
>>
>>So?  12K micro-ops.  8kb data.  Core-speed L2 cache with 512KB unified
>>cache.  Seems to work quite well in all the testing I have done.
>
>If it is in theory simply 2 processors then 11% at older types and 18% at
>new P4 3.06ghz is not much and because of the small L1. Also i didn't
>figure out yet how big the branch prediction table (BTB) is in the P4
>but it probably isn't so impressive.

The BTB size doesn't affect performance that much. The Pentium 4's is comparable
to the K7's, but I never concerned myself with useless details.

The small L1 is not the reason why the Pentium 4 with Hyperthreading only gets
11%, or 18% when the performance fairy decides to crank it up a notch. (Look for
the performance fairy enable option in your BIOS.) The Pentium 4 has 512 KB of
L2 cache -- more than any variant of the K7 has in total. L2 is not as fast as
L1, but it doesn't make a huge difference because it's a lot faster than main
memory.

The real reason the figure varies from 11% to 18% across processors is that
you aren't benchmarking accurately. You can't run one test and call it
conclusive.
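
As a minimal sketch of what repeated benchmarking could look like: run the engine several times with and without Hyperthreading, compute the gain for each pair of runs, and report the mean together with the spread. The function name `speedup_summary` and the nodes-per-second figures are hypothetical placeholders, not measurements from DIEP or any other program.

```python
import statistics

def speedup_summary(single_nps, smt_nps):
    """Return (mean gain %, sample std dev %) over paired benchmark runs."""
    gains = [100.0 * (smt / single - 1.0)
             for single, smt in zip(single_nps, smt_nps)]
    return statistics.mean(gains), statistics.stdev(gains)

# Hypothetical nodes-per-second results, NOT measured data.
single = [1000.0, 1020.0, 980.0, 1010.0]   # one thread per run
smt = [1150.0, 1130.0, 1160.0, 1120.0]     # two logical CPUs per run

mean_gain, spread = speedup_summary(single, smt)
print(f"SMT gain: {mean_gain:.1f}% +/- {spread:.1f}%")
```

If the spread is of the same order as the difference between two machines, a single run on each tells you nothing about which one has the better SMT.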

>>>compare with the 64KB L1 data cache of a K7 which is i guess 16384
>>>doublewords.
>>
>>
>>what is with all the quadword/doubleword nonsense?
>
>>I think _most_ here can figure out what 64 KB turns into in your favorite
>>data size...
>
>64KB of K7 and just 1024 words of P4.
>
>The P4 is using 64 bits addressing for the L1 that means just 1024 words.
>I prefer personally 16384 words of 32 bits.
>
>However the P4 doesn't deliver 2048 words of 32 bits. It delivers 1024
>words of 64 bits.

Um, what? Xeon has used 36 bits for L1 and L2 address tags since the days of the
Pentium Pro because of the PAE/PSE36 addressing extensions. The chips run on a
36-bit address bus, not a 64-bit address bus.
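
For scale, a quick sketch of what those address widths reach. The only assumption is the usual one that each address names a single byte; the helper name `addressable_bytes` is made up for this example.

```python
def addressable_bytes(address_bits):
    """Bytes reachable with byte addressing and the given address width."""
    return 1 << address_bits

GIB = 1 << 30  # bytes per GiB

for bits in (32, 36):
    print(f"{bits}-bit addresses reach {addressable_bytes(bits) // GIB} GiB")
```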

The cache is the cache, and the Pentium 4 and K7 caches are equally capable of
delivering one, two, four, eight, or sixteen bytes on correspondingly aligned
address boundaries. The K7 has a line size of 64 bytes (not bits), and the
Pentium 4 has a line size of 128 bytes. Eugene clarified some confusion about
the Pentium 4 line size, but that is completely irrelevant here.
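
The word counts being thrown around are just the cache size divided by whatever element width you pick. A small sketch, using the 8 KB and 64 KB L1 data cache sizes quoted in this thread (the helper names are made up for the example):

```python
def words_in_cache(cache_bytes, word_bits):
    """How many elements of the given width fit in the cache."""
    return cache_bytes * 8 // word_bits

def lines_in_cache(cache_bytes, line_bytes):
    """How many cache lines the cache holds."""
    return cache_bytes // line_bytes

P4_L1D = 8 * 1024    # 8 KB, as quoted above
K7_L1D = 64 * 1024   # 64 KB, as quoted above

for name, size in (("P4 L1D", P4_L1D), ("K7 L1D", K7_L1D)):
    for bits in (32, 64):
        print(f"{name}: {words_in_cache(size, bits)} words of {bits} bits")
```

Comparing 64-bit words on one chip against 32-bit words on the other exaggerates the gap; measured in the same unit the ratio is simply 64 KB to 8 KB.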

-Matt




Last modified: Thu, 15 Apr 21 08:11:13 -0700
