Computer Chess Club Archives



Subject: Re: Hammer info. And some SMP musings.

Author: Vincent Diepeveen

Date: 05:19:00 03/25/02



On March 24, 2002 at 17:28:40, Tom Kerrigan wrote:

>On March 24, 2002 at 10:37:02, Vincent Diepeveen wrote:
>
>>complete nonsense. a single P4 can't even outgun a single cpu MP K7.
>
>What planet are you on? I didn't mention Athlons ONCE in my post.
>
>>dual MP K7 1.2GHz for DIEP. With 3 instructions a clock, a 12KB instruction
>>trace cache and 1024 words of L1 data cache, it is of course insane to
>>run another process on a P4 processor at the same time.
>>SMT as a whole is interesting for the future, but complete nonsense for the P4.
>>It is just marketing hype. For sure no single P4 can ever profit from it.
>
>And I'm sure you have measurements to back up this assertion?

Have you written a parallel chess program yet?

>>Note that 'thread' gets confused with 'process' here too. Processes can
>>execute something directly, while threads are all *forced* to do things
>>indirectly, using extra registers for indirection, so for me there is already
>>a clear speed difference between the two. Let's skip that difference
>
>The only difference between a thread and a process is the memory they use.
>Intel's SMT presents the chip as two unique processors anyway.

And the indirection each thread is using, as you can see when you look at the
assembly. That's loads of extra instructions!

Multithreading is a hell of a lot slower than multiprocessing!

>>Whatever the 'idle' time on processor A, process A1 is simply executing
>>way, way faster than A2. Also, A2 is completely thrashing the 1024-word
>>L1 data cache and the small 12KB iop trace cache
>
>How do you know that the processes don't get equal CPU time, and that two chess
>threads would thrash the caches?

A 1024-word L1 cache for a program with a 4MB data structure:
DO THE MATH yourself.
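
For readers who want the arithmetic spelled out, here is a minimal sketch using only the two figures quoted in this post (1024 words of L1 data cache, a 4MB data structure). The word size is an assumption (32-bit words, so 4 bytes each), since the post does not state it, and the even split between two threads is likewise a simplification for illustration. The function and variable names are hypothetical.

```python
# Figures taken from the post; the word size is NOT stated there and is assumed.
L1_WORDS = 1024
BYTES_PER_WORD = 4                    # assumption: 32-bit words
WORKING_SET_BYTES = 4 * 1024 * 1024   # the "4MB data structure"


def cache_coverage(cache_bytes: int, working_set_bytes: int) -> float:
    """Fraction of the working set that can be resident in the cache at once."""
    return min(1.0, cache_bytes / working_set_bytes)


l1_bytes = L1_WORDS * BYTES_PER_WORD

# One thread owning the whole L1 data cache.
one_thread = cache_coverage(l1_bytes, WORKING_SET_BYTES)

# Two SMT threads, assuming (for illustration only) an even split of the cache.
two_threads = cache_coverage(l1_bytes // 2, WORKING_SET_BYTES)

print(f"L1 size: {l1_bytes} bytes")
print(f"one thread:  {one_thread:.4%} of the data structure fits")
print(f"two threads: {two_threads:.4%} each, if the cache is split evenly")
```

Note that this is only a capacity ratio, not a miss rate: how often the cache actually misses depends on the program's access locality, which this arithmetic says nothing about.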

Any chess program that uses less than 1024 words is not going
to be allowed into the world championships without paying Levy loads
of money.

>>Also, I have no idea what takes more of the processor's system time for my
>>program, but I assume it is the decoding of integer instructions and
>>branches and, most importantly, the 3-instructions-a-clock limit.

>How do you know your ILP is being limited by EUs? The P4 averages less than one
>instruction per clock when running most programs, and you're saying your program
>is different by a factor of 3?

I didn't know you wrote your software *that* badly.

'Most programs' aren't chess programs. Even on the Pentium Pro it was
already near 2 instructions a clock for the 'average' program, as measured
by Intel. Since then, processors and compilers have only gotten relatively
faster for me.

On the K7 MP with Visual C++ 6.0 SP4 + Processor Pack, I'm 100% sure I'm
closer to or above 2 than under 1.

Anyhow, SMT is not working for the P4.

It is like loading a mule with a jumbo cargo.

SMT needs a NEW processor, a thousand times more advanced than
the P4 is. The P4 is a failure in all respects except marketing.

Even overclocking a P3 is cheaper, safer and... ...faster...

>-Tom




Last modified: Thu, 15 Apr 21 08:11:13 -0700
