Author: Robert Hyatt
Date: 17:21:47 12/31/02
On December 31, 2002 at 13:57:13, Matt Taylor wrote:

>On December 31, 2002 at 11:49:31, Vincent Diepeveen wrote:
>
>>On December 30, 2002 at 22:32:52, Robert Hyatt wrote:
>>
>>>On December 30, 2002 at 20:29:11, Vincent Diepeveen wrote:
>>>
>>>>On December 30, 2002 at 19:39:23, Frank Koenig wrote:
>>>>
>>>>>Two questions.
>>>>>
>>>>>One) Will Intel's HT technology be able to help chess programs above and beyond
>>>>>just allowing one CPU to appear as two?
>>>>>
>>>>>Second) If you are running XP, will HT require XP Pro instead of XP Home to take
>>>>>advantage of it?
>>>>>
>>>>>Thanks,
>>>>>
>>>>>Frank
>>>>
>>>>For dual machines you need even newer releases of OSes to still get
>>>>released.
>>>>
>>>>However you can profit from it in a very limited way. It's a speedup of
>>>>18% for DIEP at the latest P4 (3.06Ghz), at older P4s the profit is less
>>>>(like P4 Xeon 2.8Ghz) and even older P4s the profit is zero or negative.
>>>
>>>Any chance you will _ever_ "test before talking"?
>>>
>>>The 2.8 xeon has the _same_ SMT core as the PIV/3.06. The _same_ means
>>>"the same", not "something that is not as good as."
>>
>>http://www.realworldtech.com/index.cfm and ask intel designers themselves.
>>
>>>That is simply a crock statement that is nonsense. From _testing_ on
>>>my part...
>>
>>I see a clear difference in performance. Intel managed to slowly improve SMT
>>to what it is now. I do not find 18% impressive knowing the chip is already
>>that much slower than the K7 for me.
>
>Um, they don't "fix" things that quickly. The Hyperthreading in the Pentium 4
>3.06 GHz is the same Hyperthreading in the Xeon 2.2, 2.4, 2.53, and 2.8 GHz
>chips.
>
>You still seem to be hung up on ipc. The K7 had better ipc than the Pentium
>Pro/2/3. The Pentium 4 is not designed to be efficient with work done per clock.
>It is designed to ramp to high clock speeds, something the K7 will not do. The
>K7 is nearing the end of its lifetime -- it's running more than 4 times faster
>than its introductory speed.
>
>>>>So it's progressing but the P4 is a processor not really mature enough:
>>>>too little trace cache and too little datacache: just 1024 quadwords;
>>>
>>>So? 12K micro-ops. 8kb data. Core-speed L2 cache with 512KB unified
>>>cache. Seems to work quite well in all the testing I have done.
>>
>>If it is in theory simply 2 processors then 11% at older types and 18% at
>>new P4 3.06ghz is not much and because of the small L1. Also i didn't
>>figure out yet how big the branch prediction table (BTB) is in the P4
>>but it probably isn't so impressive.
>
>The BTB size doesn't affect that much. It is comparable, but I never concerned
>myself with useless details.
>
>The small L1 is not the reason why the Pentium 4 with Hyperthreading only gets
>11% or 18% when performance fairy decides to crank it up a notch. (Look for the
>performance fairy enable option in your BIOS.) The Pentium 4 has 512 KB of L2
>cache -- more than any variant of the K7 has in total. L2 is not as fast as L1,
>but it doesn't make a huge difference because it's a lot faster than main
>memory.
>
>The real reason why the data changes from 11-18% across processors is because
>you aren't accurately benchmarking. You can't run one test and call it
>conclusive.
>
>>>>compare with the 64KB L1 data cache of a K7 which is i guess 16384
>>>>doublewords.
>>>
>>>what is with all the quadword/doubleword nonsense?
>>
>>>I think _most_ here can figure out what 64 KB turns into in your favorite
>>>data size...
>>
>>64KB of K7 and just 1024 words of P4.
>>
>>The P4 is using 64 bits adressing for the L1 that means just 1024 words.
>>I prefer personally 16384 words of 32 bits.
>>
>>However the P4 doesn't deliver 2048 words of 32 bits. It delivers 1024
>>words of 64 bits.
>
>Um, what? Xeon has used 36 bits for L1 and L2 address tags since the days of the
>Pentium Pro because of the PAE/PSE36 addressing extensions. The chips run on a
>36-bit address bus, not a 64-bit address bus.

You are making a terrible mistake in this discussion. Vincent doesn't know
a thing about hardware. He is mixing terminology right and left. He mixes
bus width with address bits repeatedly. He _means_ transferring 64 bits per
cycle over the L1-CPU bus.

But even without all the terminology issues, the discussion is _hopeless_.
Of course, you probably already realize that anyway...

>
>The cache is the cache, and the Pentium 4 and K7 caches are equally capable of
>delivering bytes, two bytes, four bytes, eight bytes, or sixteen bytes on
>respectively aligned address boundaries. The K7 has a line size of 64 bytes (not
>bits), and the Pentium 4 has a line size of 128 bytes. Eugene clarified some
>confusion about the Pentium 4 line size, but that is completely irrelevant here.
>
>-Matt

Yes... L1 and L2 have different line sizes, which seems odd but probably
makes sense to the architect guys...
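
For anyone trying to follow the "words" argument above, here is a rough sketch of the arithmetic. It assumes only the figures quoted in the thread (8 KB of L1 data on the P4, 64 KB on the K7, 36 address bits). The helper name words_in_cache is made up for illustration and comes from none of the posts. The point is that the word count is just capacity divided by whatever item size you pick, and that address bits are a separate quantity.

```python
# Sketch only: sizes are the figures quoted in the thread;
# words_in_cache is a hypothetical helper, not from any real tool.
KB = 1024

def words_in_cache(cache_bytes, word_bits):
    """Return how many items of word_bits width fit in cache_bytes."""
    return cache_bytes * 8 // word_bits

P4_L1_DATA = 8 * KB    # "8kb data" from the thread
K7_L1_DATA = 64 * KB   # "64KB L1 data cache of a K7" from the thread

for name, size in (("P4 L1 data", P4_L1_DATA), ("K7 L1 data", K7_L1_DATA)):
    for bits in (32, 64):
        print(f"{name}: {words_in_cache(size, bits)} words of {bits} bits")

# Address bits are unrelated to the transfer width: 36 address bits
# cover 2**36 bytes of physical memory.
ADDRESS_BITS = 36
addressable_gb = 2**ADDRESS_BITS // 2**30
print(f"{ADDRESS_BITS}-bit addressing covers {addressable_gb} GB")
```

The same 8 KB cache holds 1024 64-bit items or 2048 32-bit items; the capacity is the same either way, only the unit of counting changes.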