Author: Robert Hyatt
Date: 14:07:42 12/10/03
On December 10, 2003 at 16:37:32, Gerd Isenberg wrote:

>On December 10, 2003 at 15:57:12, Robert Hyatt wrote:
>
>>On December 10, 2003 at 15:00:18, Gerd Isenberg wrote:
>>
>>>On December 10, 2003 at 09:08:05, Robert Hyatt wrote:
>>>
>>>>On December 10, 2003 at 01:10:23, Russell Reagan wrote:
>>>>
>>>>>On December 10, 2003 at 00:20:35, Slater Wold wrote:
>>>>>
>>>>>>144 - SuSe 8 - gcc 33 -m32 = 1109
>>>>>>144 - SuSe 8 - gcc 33 -m64 = 1562
>>>>>>
>>>>>>41% going from 32 to 64 bit on Crafty!
>>>>>>
>>>>>>And others:
>>>>>>
>>>>>>144 - SuSe 8 - ICC 7.0 (32)= 1199
>>>>>>144 - W2003E - ICC 7.0 (32)= 1230
>>>>>
>>>>>I think there are more questions to answer. One is the one you just answered,
>>>>>which is how much of a speedup we can get from the 64-bit compilation alone. Another
>>>>>is how much of a speedup we get from the Opteron's hardware (ex. 32-bit Athlon
>>>>>vs. 64-bit Athlon/Opteron).
>>>>>
>>>>>Another is how much of a speedup non-bitboard programs will get from the 64-bit
>>>>>hardware and 64-bit compilation. Maybe someone could compile some non-bitboard
>>>>>programs. I guess even TSCP's bench command might give us some answers.
>>>>>
>>>>>One question I have is, does the 32-bit gcc compilation on 64-bit hardware still
>>>>>take advantage of all 16 general purpose registers? Or does it compile it for a
>>>>>32-bit executable you could run on a 32-bit CPU?
>>>>
>>>>
>>>>When you specify -m32, you get an X86 executable, which means no unusual
>>>>registers or anything. -m64 (default on the box I am testing on) adds
>>>>both 64 bit registers and the extra 8 registers %r8-%r15...
>>>
>>>Yes, with x86-32 we have usually six or seven 32-bit registers,
>>>eax,ebx,ecx,edx,esi,edi,(ebp), keeping up to three bitboards (most likely only
>>>two). With x86-64 there are theoretically up to 15 32-bit as well as bitboard
>>>registers - five times more registers for bitboards ;-)
>>>
>>>The drawbacks are additional instruction prefixes and 64-bit long addresses.
>>>Therefore even longer direct data access instructions and doubled memory space
>>>for storing pointers or references. I would prefer a tiny 32/64-bit mode with
>>>32-bit addresses but all registers, with prefix 64-bit wide.
>>
>>Not there. In fact, you can't even do a bsf %r8, %eax to get a 32 bit
>>counter for a 64 bit value. That surprised me, since bsf is not going
>>to produce a result > 8 bits anyway. :)
>>
>>I have not investigated (yet) what happens if you load a 32 bit value
>>into a 32 bit register, then use the 64 bit register. Do the upper
>>32 bits get clobbered (expected) or left alone as in using ah/al in
>>8 bit land?
>
>
>AMD64 Technology
>AMD64 Architecture
>Programmer’s Manual
>Volume 1:
>Application Programming
>
>page 33
>
>Zero-Extension of 32-Bit Results.
>
>As Figure 3-3 and Figure 3-4 show, when performing 32-bit operations with a GPR
>destination in 64-bit mode, the processor zero-extends the 32-bit result into
>the full 64-bit destination. 8-bit and 16-bit operations on GPRs preserve all
>unwritten upper bits of the destination GPR. This is consistent with legacy
>16-bit and 32-bit semantics for partial-width results.
>
>Software should explicitly sign-extend the results of 8-bit, 16-bit, and 32-bit
>operations to the full 64-bit width before using the results in 64-bit address
>calculations.
>...
>

OK.. I know what to expect there, now.

Bob

>
>
>>
>>In any case, the box is interesting, and I'm drowning in details. :)
>>
>>IE linux memory management is interesting. You want the local stack to
>>be in the memory attached to the processor you run on. But you have to
>>malloc() the stack before creating the thread. But it turns out that
>>malloc doesn't fill in page tables, that happens when the pages are
>>first referenced, and I can tell linux "allocate local when possible"
>>so that the stack faults into the local memory on the processor running
>>that thread. Now I have to make sure to glue that thread to that
>>processor.
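
The pin-then-touch idea in the paragraph above could look roughly like the sketch below. It assumes Linux with glibc. The names worker_t, first_allowed_cpu and run_pinned_worker are made up for illustration and are not taken from Crafty. A plain work buffer stands in for the thread stack, and the kernel is assumed to place each page on the node of the CPU that first touches it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for CPU_SET and pthread_setaffinity_np */
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread work area (illustrative only). */
typedef struct {
    int cpu;       /* CPU the worker should be glued to */
    char *buf;     /* malloc()ed by the parent, not touched until the worker runs */
    size_t len;    /* size of buf in bytes */
    int status;    /* 0 once the worker has pinned itself and touched buf */
} worker_t;

/* Return the first CPU this process is allowed to run on, or -1 on error. */
int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int c = 0; c < CPU_SETSIZE; c++)
        if (CPU_ISSET(c, &set))
            return c;
    return -1;
}

static void *worker_main(void *arg) {
    worker_t *w = arg;
    cpu_set_t set;
    assert(w != NULL);
    CPU_ZERO(&set);
    CPU_SET(w->cpu, &set);
    /* Pin first, so the page faults below are taken on the intended CPU. */
    if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return NULL;
    /* First touch: a large malloc() normally hands back untouched pages,
       so writing here is what actually makes the kernel back them. */
    memset(w->buf, 0, w->len);
    w->status = 0;
    return NULL;
}

/* Allocate in the parent, touch in the pinned child.
   Returns 0 on success, -1 on any failure. */
int run_pinned_worker(int cpu, size_t len) {
    worker_t w = { cpu, malloc(len), len, -1 };
    pthread_t tid;
    if (w.buf == NULL)
        return -1;
    if (pthread_create(&tid, NULL, worker_main, &w) != 0) {
        free(w.buf);
        return -1;
    }
    pthread_join(tid, NULL);
    free(w.buf);
    return w.status;
}
```

A caller would do something like run_pinned_worker(first_allowed_cpu(), 1 << 20). Small allocations may come from heap pages the parent has already touched, so the effect only shows up for buffers big enough to get fresh pages.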
>>
>
>sounds interesting but difficult...
>
>>It's loads of fun. :)
>>
>>And I don't want to break non-NUMA SMP search either. :)
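
For reference, the zero-extension rule quoted from the AMD64 manual can be written down as a small C model. This is only a sketch of the rule as quoted, not a measurement of real hardware, and the function names (write32, write16, write8, sign_extend32, check_model) are invented for illustration. Each write function takes the old 64-bit register value and returns the value after a write of the given width.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit write: the result is zero-extended, the old upper half is lost. */
uint64_t write32(uint64_t reg, uint32_t v) {
    (void)reg;                         /* old contents do not survive */
    return (uint64_t)v;
}

/* 16-bit write: only bits 0..15 change, the rest is preserved. */
uint64_t write16(uint64_t reg, uint16_t v) {
    return (reg & ~UINT64_C(0xFFFF)) | v;
}

/* 8-bit write to the low byte (al-style): only bits 0..7 change. */
uint64_t write8(uint64_t reg, uint8_t v) {
    return (reg & ~UINT64_C(0xFF)) | v;
}

/* Explicit sign extension, which the manual asks software to do before
   using a 32-bit result in a 64-bit address calculation. */
int64_t sign_extend32(int32_t v) {
    return (int64_t)v;
}

/* Walk through the cases with a register that starts as all ones. */
void check_model(void) {
    uint64_t r = UINT64_MAX;
    assert(write32(r, 0x12345678u) == UINT64_C(0x0000000012345678));
    assert(write16(r, 0x1234) == UINT64_C(0xFFFFFFFFFFFF1234));
    assert(write8(r, 0x12) == UINT64_C(0xFFFFFFFFFFFFFF12));
}
```

The practical catch is a negative 32-bit index: zero-extended, -4 turns into 0xFFFFFFFC, which is why the manual asks for the explicit sign extension before address arithmetic.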
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.