Author: Robert Hyatt
Date: 09:56:06 11/03/02
On November 03, 2002 at 03:29:09, Gian-Carlo Pascutto wrote:

>On November 03, 2002 at 02:00:41, Anthony Cozzie wrote:
>
>>If I understand SMT correctly, its even more than that. A modern processor has
>>a lot of functional units lying around. An Athlon can in theory execute 3
>>branches, 3 integer instructions, and 3 floating point instructions every
>>cycle. In reality, most of the time those units are just sitting around. One
>>of the ideas behind SMT is that you can run 2 threads, and split the
>>functional units between them.
>
>The problem with this (and the reason I was surprised SMT works) is that
>it only has 3 decoders and a single cache that is used by both processes.
>
>It is somewhat irrelevant that you have 9 function units if your
>processor is decoder-limited (true for most modern cpus). I guess
>all improvement from SMT is because of memory waiting as Robert
>describes.
>
>--
>GCP

The issue is any sort of instruction stream like this:

    opcode eax, something
    opcode something, eax

The second can't execute until the first finishes, because it needs the value the first one puts in eax. If the first is a memory read, then several hundred cycles are going to pass before the data arrives. But it could also be a long-latency instruction. If it takes the first instruction 10 cycles, then there will be 10 dead cycles before the second can start. That is 10 cycles another thread could soak up if you are lucky...

Of course, for some programs, this doesn't help a bit. And if you use spinlocks, it can even get worse...
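To make the dependency point concrete, here is a minimal C sketch. It is not taken from any chess program, and the function names chase and sum are made up for illustration. In chase, every load needs the result of the previous load as its address, so the loads cannot overlap and the core sits idle while each one completes. In sum, the load addresses do not depend on loaded data, so the hardware is free to have several loads in flight at once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: follow a chain of indices.
   Each load uses the result of the previous load as its address,
   so load k+1 cannot start until load k has delivered its data. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    assert(next != NULL);
    size_t i = start;
    while (steps-- > 0)
        i = next[i];            /* address depends on the previous load */
    return i;
}

/* Hypothetical helper: add up an array.
   The load addresses v[0], v[1], ... are known in advance, so the
   loads are independent of each other even though the additions
   into total still happen one after another. */
long sum(const long *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t k = 0; k < n; k++)
        total += v[k];          /* address does not depend on loaded data */
    return total;
}
```

Calling chase on a table much larger than the cache, filled with a random cycle of indices, would be one way to produce the kind of stalled instruction stream described above. Whether a second thread can actually use those dead cycles depends on the processor and on what that thread is doing.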
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.