Author: Robert Hyatt
Date: 15:03:44 11/04/02
On November 04, 2002 at 15:58:09, Gian-Carlo Pascutto wrote:
>On November 04, 2002 at 11:32:53, Robert Hyatt wrote:
>
>>>If it spends 20% of its time on this (a realistic number
>>>on a high end P4) and the parallel speedup is 1.7 then it
>>>is going to run about 5% faster with SMT, roughly.
>>
>>Where does that "math" come from? (5%)
>
>Assuming the speedup comes from that 20% waiting time that
>can be eliminated, and your parallel efficiency is 1.7, I
>do 1.20*(1.70/2) and arrive at a speedup of eh, 2%.
>(I realize this is quite fuzzy math)
That looks _way_ too fuzzy for me. I.e., I would expect to get 70% of that "extra
processor", which, using your case above, would be a 14% improvement.
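To make the two back-of-envelope estimates concrete, here is a small C sketch of both calculations. The function names are hypothetical, and the second formula is only one reading of the "70% of that extra processor" remark (the reclaimed 20% treated as a second processor running at 70% efficiency); neither is a measured result.

```c
/* GCP's arithmetic as quoted: (1 + stall fraction) scaled by the
   parallel efficiency, i.e. 1.20 * (1.70 / 2). */
static double gcp_estimate(double stall_fraction, double parallel_speedup)
{
    return (1.0 + stall_fraction) * (parallel_speedup / 2.0);
}

/* One reading of the reply (an assumption): the reclaimed stall time acts
   as an extra processor that contributes (speedup - 1) of its capacity,
   i.e. 1 + 0.20 * 0.70. */
static double extra_processor_estimate(double stall_fraction,
                                       double parallel_speedup)
{
    return 1.0 + stall_fraction * (parallel_speedup - 1.0);
}
```

With a stall fraction of 0.20 and a parallel speedup of 1.7, the first function gives a factor of about 1.02 (2%) and the second about 1.14 (14%), matching the two figures in the exchange.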
>
>>I have seen a 30% improvement
>>in NPS using hyper-threading on a 2.2GHz PIV. That should translate into
>>a roughly 20% improvement in search speed to a specific depth. That seemed
>>to be close to the numbers Eugene posted as well.
>
>Why is it so much? Is Crafty so memory-bound?
>
I think any program is memory bound to some degree. But memory is not the only
issue. I.e., instruction dependencies stall one instruction stream frequently,
such as when the current instruction needs the result from a previous
instruction. Filling in these "gaps" gains a good bit...
>>Once I have time to fiddle with the locks to add the pause, I expect even
>>better performance...
>
>I use this, works fine with Intel C
>
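>__inline void Lock (volatile int *hPtr)
>{
> __asm
> {
> mov ecx, hPtr
> la: mov eax, 1
> xchg eax, [ecx]   ; atomically swap 1 into the lock word
> test eax, eax
> jz end            ; old value was 0: lock acquired
> lb: pause         ; spin-wait hint while the lock is held
> mov eax, [ecx]    ; plain read, no bus lock while waiting
> test eax, eax
> jz la             ; lock looks free: retry the xchg
> jmp lb
> end:
> }
>}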
>
>--
>GCP
Mine didn't like "pause"; that was the problem. I'll try it with the most
recent release to see if that fixes it...
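For a toolchain that does not accept "pause" in inline assembly, the same test-and-test-and-set loop can be sketched in C11 using `<stdatomic.h>` and the `_mm_pause()` intrinsic from `<immintrin.h>`. This is a minimal sketch, not the code either poster used: the names `spin_t`, `spin_lock` and `spin_unlock` are hypothetical, the unlock routine is an assumption (the quoted code shows only the lock side), and the intrinsic is x86/x86-64 only.

```c
#include <stdatomic.h>
#include <immintrin.h>   /* _mm_pause(); x86 / x86-64 only */

typedef atomic_int spin_t;   /* 0 = free, 1 = held (hypothetical type name) */

static inline void spin_lock(spin_t *lock)
{
    for (;;) {
        /* Atomic swap, like "xchg eax, [ecx]": an old value of 0
           means the lock was free and is now ours. */
        if (atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0)
            return;

        /* Otherwise wait with pause and plain reads until the lock
           looks free, then go back and retry the swap. */
        do {
            _mm_pause();
        } while (atomic_load_explicit(lock, memory_order_relaxed) != 0);
    }
}

/* Assumed counterpart: release the lock by storing 0. */
static inline void spin_unlock(spin_t *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

A caller would declare `spin_t lock = 0;` and bracket the critical section with `spin_lock(&lock)` and `spin_unlock(&lock)`. Spinning on plain reads keeps the waiting thread from issuing locked bus operations, and the pause tells a hyper-threaded core that the loop is a spin-wait.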