Author: Matt Taylor
Date: 10:16:30 02/11/03
Test...it seems this message disappeared, but I had it open in a window. I wonder if I can reply?

On February 10, 2003 at 23:21:58, Robert Hyatt wrote:

>On February 10, 2003 at 21:09:30, Matt Taylor wrote:
>
>>On February 10, 2003 at 14:44:40, Tom Kerrigan wrote:
>>
>>>On February 10, 2003 at 01:32:45, Matt Taylor wrote:
>>>
>>>>On February 10, 2003 at 00:44:13, Tom Kerrigan wrote:
>>>>
>>>>>On February 09, 2003 at 23:39:04, Matt Taylor wrote:
>>>>>
>>>>>>>Compilers that inline code and do "fastcalls" negate any benefit that register
>>>>>>>windowing gives you.
>>>>>>On an architecture like Sparc or IA-64 that gives you enough registers to do so.
>>>>>>Let's start counting...I have 8 registers on IA-32...1 used as frame pointer...1
>>>>>>as stack pointer...3 get preserved by convention...hmm. I guess that leaves -3-
>>>>>>registers for "fastcall" convention. This is why IA-32 usually doesn't even
>>>>>>bother with fastcall.
>>>>>
>>>>>Sure, programmer visible registers. Doesn't the P6 have 40 rename registers? Who
>>>>>knows how many the P4 has.
>>>>
>>>>I don't see your point. It's not programmer-visible, and it doesn't assist a
>>>>fastcall convention.
>>>
>>>Why wouldn't the rename registers alleviate the problem with fastcalls just as
>>>much as anything else?
>>
>>Because I don't have access to rename registers. So far as I know, they serve
>>one purpose in IA-32:
>>
>>mov [mem], ecx
>>mov ecx, blah
>
>Or this:
>
>mov eax, [something]
>add eax, edx
>imul eax, 25
>mov [something], eax
>
>mov eax, [somethingelse]
>sub eax, edx
>imul eax, 17
>add eax, 3
>mov [somethingelse],eax
>
>And after renaming, you end up with two sets of instructions that
>can be executed in parallel without the apparent eax register conflict
>since the second mov eax, [xx] renames eax to something new.
>
>Cute, and effective, and partially overcomes the paucity of registers on
>ia32. "partially" being the operative word, however. :)
>
>>These are not dependant because the processor renames ecx in the second
>>instruction to alias on a different internal register. The old value of ecx
>>sticks around so old instructions can issue out-of-order and use the correct
>>temporal value of ecx without delaying future instructions.
>>
>>This in no way, shape, or form alleviates the problem of fastcall. Dr. Hyatt
>>said exactly what I was trying to say.
>>
>>On a side note -- I recall reading something in the P3 manual about using
>>register renaming to make al, ah, ax, and eax appear differently. This was
>>labelled as the cause for the infamous partial-register stall the P3 exhibits.
>>
>
>Yes. When you think about it, trying to "piece" together parts of a register
>from a group that have been "renamed" is a mess to think about. I think Intel
>punted. Partial registers are fine if you use them consistently, but when you
>try to fiddle with al in one group of instructions, ah in another, and then
>ax or eax in yet another, it has to throw up its hands and stall the pipe to
>put all the crap back together in one register...

Yeah. I was disappointed on Athlon that I could not use al and ah in conjunction (when not referring to ax or eax). The processor decides that it needs to merge them and exhibits massive stalls. The thought just occurred to me that the clear trick (xor reg, reg) on both registers might fix this.

They actually address this issue in x86-64. In 64-bit long mode, 32 bits is still the default operand size; a prefix overrides it to 64 bits. A 32-bit computation zeroes the upper half of its register instead of using the old merge behavior. Even in 64-bit long mode, the processor will still merge al/ah/ax/eax, but no merging needs to be done for eax and rax.

>I like the Cray. No partial registers. No reorder buffer. Just good, clean,
>simple, and _fast_ instruction execution, if the asm programmer (or the
>compiler) is worth anything at scheduling the instructions to avoid the
>typical dependency delays... (no register renaming on a cray either...)

Partial registers were nice when code had to be small and size-efficient. Even embedded systems have the luxury of high-level languages these days. x86 was always supposed to be clever, not fast.

>>>Why are you arguing about IA-32 anyway? I don't even like IA-32. Fastcalls work
>>>well with MIPS and Alpha, that's been well documented. And before you discount
>>>IA-32, why not actually get some data on it before trying to convince yourself
>>>one way or the other?
>>
>>I'm arguing about IA-32 because that's sort've what the thread was about. The
>>original question was something to the tune of, "How do IA-64 and AA-64 compare
>>to IA-32?" No MIPS. No Sparc. No HP-RISC. No Alpha. Just IA-32, IA-64, and
>>AA-64. The only times I have deliberately brought Sparc into the picture was for
>>comparison with IA-64.
>>
>>Dr. Hyatt is correct. I program x86 assembler on a daily basis. I do assembly
>>optimization for x86 frequently. I know the x86 ISA inside and out -- to the
>>extent that I can assemble a large number of opcodes in my head.
>>
>>I have written many fastcall functions. I know the problems that plague them. I
>>have outlined one particular fault which you have repeatedly ignored. You still
>>have no valid answer; both Dr. Hyatt and I have corrected you -- the rename
>>registers are not available to the application, and they have no tangible effect
>>on a register-passing scheme for x86. Now your answer is that I am ignorant?
>>
>>-Matt
>
>I would hope he doesn't believe that. I know a lot about a lot of computer
>architectures. But (until fairly recently) I haven't really paid a lot of
>attention to the ia32 assembly stuff. But it has gotten me interested, and
>I am learning more daily. But I would _always_ defer to those that do this
>on a daily basis. Yourself and Eugene come to mind for starters.

:)

-Matt