Author: Eugene Nalimov
Date: 21:36:17 12/03/03
On December 03, 2003 at 23:58:52, Robert Hyatt wrote:
>On December 03, 2003 at 22:51:43, Sune Fischer wrote:
>
>>On December 03, 2003 at 16:58:17, Russell Reagan wrote:
>>
>>>On December 03, 2003 at 16:35:46, Slater Wold wrote:
>>>
>>>>What's the speedup between 1, 2, and 4 CPUs?
>>>
>>>After they (Bob and Eugene) did the NUMA stuff for Windows, 4 cpus was like a
>>>3.84x speedup.
>>>
>>>>Any idea on the speedup of going
>>>>to 64-bit?
>>>
>>>Clock for clock, Crafty is about 60% faster on 64-bit hardware, i.e. a 2GHz
>>>Opteron would run Crafty about 60% faster than a 2GHz 32-bit Athlon. Gian Carlo
>>>reported that Sjeng ran 70% faster, clock for clock.
>>
>>The Opteron has lots of improvements other than the 64 bit thing, so it is still
>>not exactly known what is contributing where for Crafty.
>>
>>I suspect Crafty would get a good speedup on a 32-bit Athlon too if it had 1 MB
>>cache and more registers, this should somehow be factored out.
>>
>>Granted that's not easy to do, but if/when we manage to take a handful of
>>bitboard programs and compare their speedup to a handful of non-bitboard
>>programs, then we might get a better impression of how much the 64-bit thing
>>matters overall.
>>
>>It is also possible that the first generation chess programs and compilers won't
>>be optimal. First tests are often 'worst case' scenarios.
>>
>>-S.
>
>
>The thing that was most revealing was the 32 vs 64 bit stuff. Things like
>FirstOne() are a bit messy on 32 bit machines. On the Opteron it is dirt
>simple:
>
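>static __inline__ int FirstOne(long word) {
>  long dummy, dummy2;
>  asm (
>    "   bsrq  %0, %1"  "\n\t"   /* dummy2 = index of highest set bit */
>    "   jnz   1f"      "\n\t"   /* a bit was found: skip the fixup */
>    "   movq  $-1, %1" "\n\t"   /* word == 0: force the index to -1 */
>    "1: movq  $63, %0" "\n\t"
>    "   subq  %1, %0"  "\n\t"   /* dummy = 63 - index (64 if word == 0) */
>    : "=&r" (dummy), "=&r" (dummy2)
>    : "0" (word)
>    : "cc");
>  return (int) dummy;
>}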
>
>bsrq is bsr for 64 bits. I use the "safe" version that does a test to see
>if no bits were set. If any were, I skip the move of -1 to a register and leave
>that register as set by bsrq. The 32 bit version is more than twice as long.
>I will get rid of the jump with a cmovq later, but I just didn't feel like
>fooling with it after I initially got it working.
Here a conditional move will be slower.
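For reference, the two shapes being compared can be sketched in C. This is only an illustration, not Crafty's code: the function names are made up, and it assumes GCC or Clang for the __builtin_clzll intrinsic. Both return the same values as the asm quoted above (63 minus the index of the highest set bit, and 64 for an empty word). A likely reason the branch wins is that the word is almost never zero at the call sites, so the jump predicts essentially perfectly, while a select puts one more dependent operation on every call.

```c
#include <stdint.h>

/* Hypothetical names; assumes GCC/Clang (__builtin_clzll). */

/* Branching shape: like the jnz in the quoted asm, the zero case is
 * handled by a jump that is rarely taken. */
static inline int first_one_branch(uint64_t word) {
    if (word == 0)
        return 64;                    /* no bits set */
    return __builtin_clzll(word);     /* same as 63 - bsr(word) */
}

/* Select shape: always compute, then choose. A compiler may turn the
 * ternary into a cmov, which extends the dependency chain. */
static inline int first_one_select(uint64_t word) {
    int n = __builtin_clzll(word | 1);  /* argument is never zero */
    return word ? n : 64;
}
```

Whether the compiler actually emits a jump or a cmov for either function depends on the compiler and flags, so check the generated assembly before timing anything.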
Thanks,
Eugene
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.