Author: Robert Hyatt
Date: 08:46:05 12/03/02
On December 02, 2002 at 23:21:23, Matt Taylor wrote:

><snip>
>>>Does hyper-threading really help that much? It seems like it would create more
>>>contention for limited resources (decoder, internal u-op cache, even some
>>>execution units). I would be extremely interested in seeing hyperthreading
>>>benchmarks with Crafty.
>>
>>
>>First, if you look at the concept of trace-cache, it is _behind_ the decoder,
>>and all it stores are decoded instructions (micro-ops). Since Crafty uses the
>>_same_ code in all threads, it is likely that the shared L1 I-cache (and the
>>L1 D-cache and L2 cache) will all contain stuff that is useful across the two
>>threads...
>
>Yes, but the figures Intel lists are 1 instruction decoded per cycle and up to 3
>supplied by the trace cache. I suppose hyperthreading would make no sense unless
>they doubled the front-end of the pipeline (the trace cache and decoder).
>
>Come to think of it, the P4 Xeon may very well see enormous gains from
>hyperthreading as it would unlock the full potential of the chip. P4 from the
>start has been limited to at most 3 ops/cycle from the trace cache assuming that
>the code you want is actually IN the trace cache. It is equipped with 7
>execution units. However, two of the ALUs are double-pumped allowing for up to 5
>simple ALU ops/cycle (total of 9 ops/cycle). It should be painfully clear that
>under no circumstances can the full 5 ops/cycle be used -- by a single trace
>cache, anyway. If they have a second trace cache, P4 Xeon may very well see
>nearly twice the performance in hyperthreading...
>
>>Eugene already ran some and posted the results. The raw NPS went up by a
>>factor of 1.3X. I think more can be had but at a couple of critical places
>>where I have a "busy spin" I need to insert a "pause" asm instruction so that
>>the cpu will work on the thread doing useful work if there is a choice...
>
>How can you use the hlt instruction? It's privileged, and you're in ring 3.

Not "halt". "pause". It is a no-op on non-hyperthreaded CPUs, and all it does
is cause the internal "thread scheduler" to execute the other thread until it
blocks.

>
>Intel claims about 30-40% speed gains from hyperthreading, but that makes the
>assumption that different instructions are utilized across different types of
>applications. I would also guess that it falls into a sort of resonance where
>one application is doing its heavy computation while the other utilizes the
>memory bus.

Correct... Or anything that makes _both_ threads block a fair amount, such as
waiting on memory, on results from a memory-mapped read to a device controller,
etc... If both are blocking, they interleave nicely and can go up to 2x faster.

>
>I'm not sure if the P3 ever shipped with hyperthreading, though I recall hearing
>about it in the days of the P3. The last Intel chip I bought was my Pentium 120.
>Has it been tested on a P3 Xeon with hyperthreading by any chance?

No idea. I _think_ it started with the PIV, but am not sure. The CPUID
instruction will give an indication in the processor capability bitmap it
returns...
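
For anyone wondering what a "pause" inside a busy spin might look like, here is a minimal sketch. It is not Crafty's code: the names cpu_relax and spin_until_set are made up for illustration, the inline asm assumes GCC or Clang on x86, and on other targets the helper degrades to a plain compiler barrier. The spin count limit is also an assumption, added only so the loop can give up.

```c
#include <stdatomic.h>

/* Hypothetical helper: hint to the CPU that we are in a spin-wait loop.
 * On x86 this emits the "pause" instruction; elsewhere it is only a
 * compiler barrier. Assumes GCC/Clang-style inline asm. */
static inline void cpu_relax(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical helper: spin until *flag becomes nonzero.
 * Returns 1 if the flag was seen set, 0 if max_spins iterations
 * passed without seeing it. */
static int spin_until_set(atomic_int *flag, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire))
            return 1;
        cpu_relax();  /* let the sibling logical CPU use the core */
    }
    return 0;
}
```

A worker waiting for work would call spin_until_set(&flag, limit) and fall back to something else (yielding, blocking) if it returns 0. "pause" is an unprivileged instruction, so this runs fine in ring 3.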
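
The CPUID check mentioned above can be sketched roughly as follows. This assumes GCC or Clang, whose <cpuid.h> provides __get_cpuid; the function names htt_bit_from_edx and cpuid_htt_flag are hypothetical. The bit tested is the HTT flag, bit 28 of EDX from CPUID leaf 1. Note that this flag only says the package can report more than one logical processor; it does not by itself prove hyperthreading is enabled, so treat the result as an indication only.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper: __get_cpuid() */
#endif

/* Extract the HTT flag (bit 28) from the EDX value of CPUID leaf 1. */
static int htt_bit_from_edx(unsigned int edx)
{
    return (int)((edx >> 28) & 1u);
}

/* Hypothetical helper: returns 1 if CPUID leaf 1 reports the HTT flag,
 * 0 if it does not, and -1 if CPUID leaf 1 cannot be queried
 * (non-x86 build or leaf unsupported). */
static int cpuid_htt_flag(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return htt_bit_from_edx(edx);
#else
    return -1;
#endif
}
```

A program could call cpuid_htt_flag() once at startup and use the answer to decide whether spin loops are likely to be sharing a core with another thread.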
Last modified: Thu, 15 Apr 21 08:11:13 -0700