Computer Chess Club Archives



Subject: Re: Hyper Threading and Chess

Author: Robert Hyatt

Date: 17:21:47 12/31/02



On December 31, 2002 at 13:57:13, Matt Taylor wrote:

>On December 31, 2002 at 11:49:31, Vincent Diepeveen wrote:
>
>>On December 30, 2002 at 22:32:52, Robert Hyatt wrote:
>>
>>>On December 30, 2002 at 20:29:11, Vincent Diepeveen wrote:
>>>
>>>>On December 30, 2002 at 19:39:23, Frank Koenig wrote:
>>>>
>>>>>Two questions.
>>>>>
>>>>>One) Will Intel's HT technology be able to help chess programs above and beyond
>>>>>just allowing one CPU to appear as two?
>>>>>
>>>>>Second) If you are running XP, will HT require XP Pro instead of XP Home to take
>>>>>advantage of it?
>>>>>
>>>>>Thanks,
>>>>>
>>>>>Frank
>>>>
>>>>For dual machines you need even newer releases of OSes to still get
>>>>released.
>>>>
>>>>However you can profit from it in a very limited way. It's a speedup of
>>>>18% for DIEP at the latest P4 (3.06GHz), at older P4s the profit is less
>>>>(like P4 Xeon 2.8GHz) and even older P4s the profit is zero or negative.
>>>
>>>Any chance you will _ever_ "test before talking"?
>>>
>>>The 2.8 xeon has the _same_ SMT core as the PIV/3.06.  The _same_ means
>>>"the same", not "something that is not as good as."
>>
>>http://www.realworldtech.com/index.cfm  and ask intel designers themselves.
>>
>>>That is simply a crock statement that is nonsense.  From _testing_ on
>>>my part...
>>
>>I see a clear difference in performance. Intel managed to slowly improve SMT
>>to what it is now. I do not find 18% impressive knowing the chip is already
>>that much slower than the K7 for me.
>
>Um, they don't "fix" things that quickly. The Hyperthreading in the Pentium 4
>3.06 GHz is the same Hyperthreading in the Xeon 2.2, 2.4, 2.53, and 2.8 GHz
>chips.
>
>You still seem to be hung up on ipc. The K7 had better ipc than the Pentium
>Pro/2/3. The Pentium 4 is not designed to be efficient with work done per clock.
>It is designed to ramp to high clock speeds, something the K7 will not do. The
>K7 is nearing the end of its lifetime -- it's running more than 4 times faster
>than its introductory speed.
>
>>>
>>>>
>>>>So it's progressing but the P4 is a processor not really mature enough:
>>>>too little trace cache and too little datacache: just 1024 quadwords;
>>>
>>>
>>>So?  12K micro-ops.  8kb data.  Core-speed L2 cache with 512KB unified
>>>cache.  Seems to work quite well in all the testing I have done.
>>
>>If it is in theory simply 2 processors then 11% at older types and 18% at
>>new P4 3.06GHz is not much and because of the small L1. Also i didn't
>>figure out yet how big the branch prediction table (BTB) is in the P4
>>but it probably isn't so impressive.
>
>The BTB size doesn't affect that much. It is comparable, but I never concerned
>myself with useless details.
>
>The small L1 is not the reason why the Pentium 4 with Hyperthreading only gets
>11% or 18% when performance fairy decides to crank it up a notch. (Look for the
>performance fairy enable option in your BIOS.) The Pentium 4 has 512 KB of L2
>cache -- more than any variant of the K7 has in total. L2 is not as fast as L1,
>but it doesn't make a huge difference because it's a lot faster than main
>memory.
>
>The real reason why the data changes from 11-18% across processors is because
>you aren't accurately benchmarking. You can't run one test and call it
>conclusive.
>
>>>>compare with the 64KB L1 data cache of a K7 which is i guess 16384
>>>>doublewords.
>>>
>>>
>>>what is with all the quadword/doubleword nonsense?
>>
>>>I think _most_ here can figure out what 64 KB turns into in your favorite
>>>data size...
>>
>>64KB of K7 and just 1024 words of P4.
>>
>>The P4 is using 64 bits addressing for the L1 that means just 1024 words.
>>I prefer personally 16384 words of 32 bits.
>>
>>However the P4 doesn't deliver 2048 words of 32 bits. It delivers 1024
>>words of 64 bits.
>
>Um, what? Xeon has used 36 bits for L1 and L2 address tags since the days of the
>Pentium Pro because of the PAE/PSE36 addressing extensions. The chips run on a
>36-bit address bus, not a 64-bit address bus.

You are making a terrible mistake in this discussion.  Vincent doesn't know
a thing about hardware.  He is mixing terminology right and left.  He mixes
bus width with address bits repeatedly.  He _means_ transferring 64 bits per
cycle over the L1-CPU bus.  But even without all the terminology issues,
the discussion is _hopeless_.

Of course, you probably already realize that anyway...
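
For anyone who wants the quadword/doubleword arithmetic spelled out, here is a
minimal Python sketch.  The function name is made up for illustration; the
sizes are the 8KB and 64KB L1 data cache figures quoted in this thread.  It
only shows that capacity does not depend on the word size you count in, so
"1024 quadwords" against "16384 doublewords" compares two different units.

```python
KB = 1024

def words_in_cache(cache_bytes: int, word_bits: int) -> int:
    """How many word_bits-wide words fit in cache_bytes bytes."""
    return cache_bytes * 8 // word_bits

# L1 data cache sizes as quoted in the thread.
caches = {"P4 L1 data": 8 * KB, "K7 L1 data": 64 * KB}

for name, size in caches.items():
    # Count the same capacity in 32-bit and in 64-bit words.
    print(name, words_in_cache(size, 32), "x 32-bit =",
          words_in_cache(size, 64), "x 64-bit")
```

Counted in the same unit, the K7 L1 holds 8 times what the P4 L1 holds, which
is just 64KB against 8KB again.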




>
>The cache is the cache, and the Pentium 4 and K7 caches are equally capable of
>delivering bytes, two bytes, four bytes, eight bytes, or sixteen bytes on
>respectively aligned address boundaries. The K7 has a line size of 64 bytes (not
>bits), and the Pentium 4 has a line size of 128 bytes. Eugene clarified some
>confusion about the Pentium 4 line size, but that is completely irrelevant here.
>
>-Matt

Yes...  L1 and L2 have different line sizes, which seems odd but probably
makes sense to the architect guys...
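
To put rough numbers on the line sizes, here is a small Python sketch that
counts lines per cache.  It assumes the 128-byte figure above refers to the
P4 L2 and that the P4 L1 data cache uses 64-byte lines; both are assumptions
on my part, and the helper name is hypothetical.

```python
KB = 1024

def lines_in_cache(cache_bytes: int, line_bytes: int) -> int:
    """Number of cache lines, given total capacity and line size in bytes."""
    return cache_bytes // line_bytes

# (capacity, line size); the P4 L1 line size of 64 bytes is an assumption.
configs = {
    "P4 L1 data": (8 * KB, 64),
    "P4 L2": (512 * KB, 128),
    "K7 L1 data": (64 * KB, 64),
}

for name, (size, line) in configs.items():
    print(name, lines_in_cache(size, line), "lines of", line, "bytes")
```

A longer line means fewer lines for the same capacity, but more data brought
in on each miss.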


