Computer Chess Club Archives



Subject: Re: FINAL ANSWER

Author: Gerd Isenberg

Date: 13:37:32 12/10/03



On December 10, 2003 at 15:57:12, Robert Hyatt wrote:

>On December 10, 2003 at 15:00:18, Gerd Isenberg wrote:
>
>>On December 10, 2003 at 09:08:05, Robert Hyatt wrote:
>>
>>>On December 10, 2003 at 01:10:23, Russell Reagan wrote:
>>>
>>>>On December 10, 2003 at 00:20:35, Slater Wold wrote:
>>>>
>>>>>144 - SuSe 8 - gcc 33 -m32 = 1109
>>>>>144 - SuSe 8 - gcc 33 -m64 = 1562
>>>>>
>>>>>41% going from 32 to 64 bit on Crafty!
>>>>>
>>>>>And others:
>>>>>
>>>>>144 - SuSe 8 - ICC 7.0 (32)= 1199
>>>>>144 - W2003E - ICC 7.0 (32)= 1230
>>>>
>>>>I think there are more questions to answer. One is the one you just answered,
>>>>which is how much of a speedup we can get from the 64-bit compilation alone.
>>>>Another is how much of a speedup we get from the Opteron's hardware (e.g.
>>>>32-bit Athlon vs. 64-bit Athlon/Opteron).
>>>>
>>>>Another is how much of a speedup non-bitboard programs will get from the 64-bit
>>>>hardware and 64-bit compilation. Maybe someone could compile some non-bitboard
>>>>programs. I guess even TSCP's bench command might give us some answers.
>>>>
>>>>One question I have is, does the 32-bit gcc compilation on 64-bit hardware still
>>>>take advantage of all 16 general purpose registers? Or does it compile it for a
>>>>32-bit executable you could run on a 32-bit CPU?
>>>
>>>
>>>When you specify -m32, you get an X86 executable, which means no unusual
>>>registers or anything.  -m64 (default on the box I am testing on) adds
>>>both 64 bit registers and the extra 8 registers %r8-%r15...
>>
>>Yes, with x86-32 we usually have six or seven 32-bit registers,
>>eax,ebx,ecx,edx,esi,edi,(ebp), holding up to three bitboards (most likely only
>>two). With x86-64 there are theoretically up to 15 registers, each usable as a
>>32-bit or as a bitboard register - five times more registers for bitboards ;-)
>>
>>The drawbacks are additional instruction prefixes and 64-bit addresses, and
>>therefore even longer direct data access instructions and doubled memory space
>>for storing pointers or references. I would prefer a tiny 32/64-bit mode with
>>32-bit addresses but all registers, made 64 bits wide by a prefix.
>
>Not there.  In fact, you can't even do a bsf %r8, %eax to get a 32 bit
>counter for a 64 bit value.  That surprised me, since bsf is not going
>to produce a result > 8 bits anyway.  :)
>
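A minimal C sketch of the bsf point: the index of the lowest set bit of a 64-bit bitboard is always in 0..63, so it fits in a byte even though the instruction wants a 64-bit destination. `first_one` is a hypothetical helper name, not something from Crafty; `__builtin_ctzll` is the GCC/Clang builtin, with a plain loop as fallback for other compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Index (0..63) of the least significant set bit of a 64-bit bitboard.
   Hypothetical helper. The argument must be nonzero, as with bsf itself,
   whose destination is undefined for a zero source. */
static int first_one(uint64_t bb)
{
    assert(bb != 0);
#if defined(__GNUC__)
    return __builtin_ctzll(bb);   /* count trailing zeros of a 64-bit value */
#else
    int i = 0;
    while ((bb & 1) == 0) {       /* portable fallback: shift until bit 0 is set */
        bb >>= 1;
        ++i;
    }
    return i;
#endif
}
```

Because the result is returned as an `int`, the compiler is free to keep it in a 32-bit register after the 64-bit scan.
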
>I have not investigated (yet) what happens if you load a 32 bit value
>into a 32 bit register, then use the 64 bit register.  Do the upper
>32 bits get clobbered (expected) or left alone as in using ah/al in
>8 bit land?


AMD64 Technology
AMD64 Architecture Programmer’s Manual
Volume 1: Application Programming, page 33

Zero-Extension of 32-Bit Results.

As Figure 3-3 and Figure 3-4 show, when performing 32-bit operations with a GPR
destination in 64-bit mode, the processor zero-extends the 32-bit result into
the full 64-bit destination. 8-bit and 16-bit operations on GPRs preserve all
unwritten upper bits of the destination GPR. This is consistent with legacy
16-bit and 32-bit semantics for partial-width results.

Software should explicitly sign-extend the results of 8-bit, 16-bit, and 32-bit
operations to the full 64-bit width before using the results in 64-bit address
calculations.
...
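The rule quoted above can be modelled in plain C. This is only a sketch of the semantics, assuming a 64-bit integer stands in for the GPR; it does not touch real registers, and `gpr_write` and `sign_extend32` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of writing a `width`-bit result into a 64-bit GPR that currently
   holds `reg`, following the manual excerpt: 8-bit and 16-bit writes keep
   the upper bits, 32-bit writes zero them. */
static uint64_t gpr_write(uint64_t reg, uint64_t value, int width)
{
    assert(width == 8 || width == 16 || width == 32 || width == 64);
    switch (width) {
    case 8:  return (reg & ~0xFFULL)   | (value & 0xFFULL);   /* upper 56 bits preserved */
    case 16: return (reg & ~0xFFFFULL) | (value & 0xFFFFULL); /* upper 48 bits preserved */
    case 32: return value & 0xFFFFFFFFULL;                    /* upper 32 bits zeroed    */
    default: return value;                                    /* full 64-bit write       */
    }
}

/* Explicit sign extension of a 32-bit result before it is used in 64-bit
   address arithmetic. The uint32_t -> int32_t conversion is
   implementation-defined in C; the usual two's-complement behaviour is
   assumed here. */
static int64_t sign_extend32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}
```

So a 32-bit load does clobber the upper half (with zeros), unlike the ah/al case, and a negative 32-bit index needs the explicit sign extension before it is added to a 64-bit pointer.
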



>
>In any case, the box is interesting, and I'm drowning in details.  :)
>
>For example, Linux memory management is interesting.  You want the local
>stack to be in the memory attached to the processor you run on.  But you
>have to malloc() the stack before creating the thread.  It turns out that
>malloc doesn't fill in page tables; that happens when the pages are
>first referenced, and I can tell Linux "allocate local when possible"
>so that the stack faults into the local memory on the processor running
>that thread.  Now I have to make sure to glue that thread to that
>processor.
>

Sounds interesting but difficult...
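A rough sketch of the "first touch" step described in the quoted paragraph, assuming a POSIX system: the buffer is allocated up front, but each page is only written (and so faulted in) by the thread that will use it. `first_touch` is a hypothetical helper, and pinning the thread to a processor is left out; on Linux that would have to be done before the touch, for instance with `sched_setaffinity`.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Write one byte into every page of buf so the kernel maps each page now,
   on behalf of the calling thread. Intended for a freshly allocated buffer,
   since it overwrites the touched bytes with zero.
   Returns the number of page-sized strides touched. */
static size_t first_touch(volatile unsigned char *buf, size_t bytes)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = ps > 0 ? (size_t)ps : 4096;  /* 4096 fallback is an assumption */
    size_t touched = 0;

    assert(buf != NULL);
    for (size_t off = 0; off < bytes; off += page) {
        buf[off] = 0;
        ++touched;
    }
    if (bytes > 0)
        buf[bytes - 1] = 0;  /* buf may not be page-aligned: cover the last page too */
    return touched;
}
```

The idea would be to call this at the top of the new thread's start routine, after the thread has been bound to its processor, so that a local-allocation policy places the pages in that processor's memory.
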

>It's loads of fun.  :)
>
>And I don't want to break non-NUMA SMP search either. :)




Last modified: Thu, 15 Apr 21 08:11:13 -0700
