Computer Chess Club Archives



Subject: Re: OT: P4- 3 GHz with hyper-threading

Author: Robert Hyatt

Date: 09:56:06 11/03/02


On November 03, 2002 at 03:29:09, Gian-Carlo Pascutto wrote:

>On November 03, 2002 at 02:00:41, Anthony Cozzie wrote:
>
>>If I understand SMT correctly, it's even more than that.  A modern processor has
>>a lot of functional units lying around.  An Athlon can in theory execute 3
>>branches, 3 integer instructions, and 3 floating point instructions every
>>cycle. In reality, most of the time those units are just sitting around. One
>>of the ideas behind SMT is that you can run 2 threads, and split the
>>functional units between them.
>
>The problem with this (and the reason I was surprised SMT works) is that
>it only has 3 decoders and a single cache that is used by both processes.
>
>It is somewhat irrelevant that you have 9 function units if your
>processor is decoder-limited (true for most modern cpus). I guess
>all improvement from SMT is because of memory waiting as Robert
>describes.
>
>--
>GCP


The issue is any sort of instruction stream like this:

opcode    eax, something
opcode    something, eax

The second can't execute until the first finishes.  If the first is a memory
read that misses the cache, then several hundred cycles are going to pass
before the data arrives.  But it could also simply be a long-latency
instruction.  If the first instruction takes 10 cycles, then there will be
10 dead cycles before the second can start.  That is 10 cycles another thread
could soak up if you are lucky...
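
To put rough numbers on the dead-cycle argument, here is a toy model, not a description of any real processor. It assumes a core that issues at most one instruction per cycle, and threads that each consist of one strictly dependent chain where every instruction must wait `dead_cycles` cycles after the previous one issues. The function name `run` and its parameters are made up for this sketch.

```python
def run(chain_lengths, dead_cycles):
    """Toy single-issue core (an assumption for illustration only).

    chain_lengths: number of instructions in each thread's dependent chain.
    dead_cycles:   cycles a thread must wait after issuing before its next
                   instruction can issue.
    Returns (cycles_until_last_issue, idle_issue_slots).
    """
    remaining = list(chain_lengths)
    ready_at = [0] * len(remaining)   # earliest cycle each thread may issue
    cycle = 0
    idle = 0
    while any(remaining):
        for t in range(len(remaining)):
            if remaining[t] and ready_at[t] <= cycle:
                remaining[t] -= 1
                ready_at[t] = cycle + 1 + dead_cycles
                break                 # only one instruction issues per cycle
        else:
            idle += 1                 # nobody was ready: a dead cycle
        cycle += 1
    return cycle, idle


one = run([4], dead_cycles=10)        # a single thread, 4 dependent instructions
two = run([4, 4], dead_cycles=10)     # two such threads sharing the core
print(one, two)
```

In this model the second thread's work fits almost entirely into slots the first thread would have left empty, so running both together costs one extra cycle rather than twice the time. Real cores have several issue ports, shared caches and limited decode bandwidth, so the actual gain is much smaller.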

Of course, for some programs, this doesn't help a bit.  And if you use
spinlocks, it can even get worse...
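
The spinlock case can be sketched with an equally crude model. It assumes one issue slot per cycle, a worker thread whose instructions are all independent (so it never stalls), and a sibling thread that spins on a lock and therefore always has an instruction ready. The strict alternation of slots between the two threads is an assumption of this sketch, not a claim about how any particular SMT processor arbitrates.

```python
def worker_cycles(n_instructions, sibling_spinning):
    """Cycles for the worker to issue n_instructions on a toy single-issue core.

    If sibling_spinning is True, a spinning thread takes every odd cycle
    (assumed round-robin arbitration), doing no useful work with it.
    """
    cycle = 0
    issued = 0
    while issued < n_instructions:
        if not sibling_spinning or cycle % 2 == 0:
            issued += 1               # the worker gets this issue slot
        # otherwise the slot is burned by the spin loop
        cycle += 1
    return cycle


print(worker_cycles(100, sibling_spinning=False))
print(worker_cycles(100, sibling_spinning=True))
```

Here the worker has no dead cycles for a second thread to soak up, so the spinning sibling can only take slots away from it, and the worker takes about twice as long.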



Last modified: Thu, 15 Apr 21 08:11:13 -0700