Computer Chess Club Archives


Subject: Re: Hyper-Threading Technology from Intel-to Hype or Not to Hype?

Author: Vincent Diepeveen

Date: 07:51:30 03/05/03


On March 05, 2003 at 10:37:14, Nolan Denson wrote:

>If that is so ... and you believe it is so ... I am happy you found your
>favorite cpu. But it's all a game and they give you a little at a time. The
>research that is done is like 10 years ahead. You will see some great
>improvements with the next P4 core (Prescott New Instructions & Hyper-Threading
>Improvements). But keep your eyes on another chip maker (TMTA).. I think their
>engineers finally got a winner. I seem to remember reading something about them
>having something that can do 8 instructions a clock cycle.

Don't talk nonsense, Nolan. It is not about what chip will be out in 10 years'
time, because in 10 years' time the competition will have something new too.

It is about what is fast for me *now*. Tests simply show that the P4 is a lot
slower than the K7 for *any* chess program tested. That includes Crafty, that
includes Fritz, and that includes every program tested on very cheap dual XPs.
The NPS I see in the log files for DIEP on those machines is way above what I
get here, and above what any P4 gets too.

When a new CPU gets released, I simply test it once it is on the market.
Bragging about something that you can't buy yet is nonsense. Looking forward to
faster hardware is cool, but it is a big BS statement to say that at *this*
moment the P4 is the fast x86 hardware. You can easily test it, and tests show
the XP is a lot faster. Period.

Of course not a single new P4 chip is going to do 8 instructions a clock anyway.
Not even the McKinley does that. The McKinley, a very high-end CPU clocked at
1 GHz, does 6 instructions a clock. The weakest link of the P4 is not its 3
instructions a clock, but its limited trace cache in combination with a decoder
that handles only 1 instruction a clock, a big branch misprediction penalty and
a very small L1 d-cache. The speed of the cache is OK.

In big software programs you trivially need a CPU that can decode more than 1
instruction a clock. That's one of the P4's weak points.

Everyone understands, then, that doing 8 instructions a clock simultaneously is
nonsense. The P4 decodes 1 instruction a clock. From within the trace cache it
can do 3 instructions a clock. The new Prescott is going to do 4. Not 8.
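
To put rough numbers on the decode argument above, here is a toy model (a sketch only, not a measurement of any real chip). It assumes that a fraction of instructions is served from the trace cache at 3 a clock and the rest goes through the decoder at 1 a clock; the function name `effective_ipc` and the hit rates are made up for illustration, and branch penalties and cache misses are ignored.

```python
def effective_ipc(trace_hit_rate, trace_width=3.0, decode_width=1.0):
    """Toy model: average instructions per clock when a fraction of the
    instructions comes from the trace cache and the rest must be decoded.

    All parameters are illustrative assumptions, not measured values.
    """
    if not 0.0 <= trace_hit_rate <= 1.0:
        raise ValueError("trace_hit_rate must be between 0 and 1")
    # Average clocks spent per instruction, weighted by where it came from.
    cycles_per_instruction = (trace_hit_rate / trace_width
                              + (1.0 - trace_hit_rate) / decode_width)
    return 1.0 / cycles_per_instruction


if __name__ == "__main__":
    for hit_rate in (1.0, 0.9, 0.75, 0.5):
        print(f"hit rate {hit_rate:.2f}: "
              f"{effective_ipc(hit_rate):.2f} instructions/clock")
```

Under these assumptions, even a 75% trace cache hit rate drags the average down to about 2 instructions a clock, which is why the decode rate matters more than the peak width for big programs.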

And even that is no big deal if they do not manage to scale up the number of
instructions decoded each clock.

Even then it is doubtful whether they will be capable of beating Opteron.

Very doubtful. Usually CPUs get improved just slightly. Turning the P4 from a
very low IPC processor into a high IPC one is not logical. It would mean the
previous engineers did a very bad job and should have been fired. Slow progress
is more logical. Slow progress means something like 8KB d-cache in current P4s
and 16KB d-cache in next-generation P4s. So perhaps Prescott has a 16KB d-cache,
which would be quite some progress because the current 8KB d-cache is a very
weak link. Improving weak links is always very good.

For SMT/HT to perform well, however, they need way more than 16KB of d-cache.
They certainly need more than 1 decoded instruction a clock shared between 2
logical CPUs.
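
As a back-of-the-envelope illustration of that point, the sketch below simply divides the figures mentioned in this post between 2 logical CPUs. The even split is an assumption made for simplicity (real hardware shares these resources dynamically), and `per_thread_share` is a made-up helper name.

```python
def per_thread_share(total, threads=2):
    """Evenly divide a shared resource between logical CPUs (simplifying assumption)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return total / threads


if __name__ == "__main__":
    # Figures taken from the post: 8KB d-cache now, perhaps 16KB next generation,
    # and 1 decoded instruction a clock.
    for label, dcache_kb in (("current P4", 8), ("possible next P4", 16)):
        print(f"{label}: {per_thread_share(dcache_kb):.0f}KB d-cache per logical CPU")
    print(f"decode: {per_thread_share(1.0):.1f} instructions/clock per logical CPU")
```

With an even split, each logical CPU is left with 4KB or 8KB of d-cache and half a decoded instruction a clock, which is the shortage being described.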

We will see what the future brings. If AMD ever manages to put SMT/HT onto their
CPUs, they'll rock. Their CPU is a lot better suited to such a feature.

Best regards,
Vincent





Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.