Author: Matt Taylor
Date: 22:32:45 02/09/03
On February 10, 2003 at 00:44:13, Tom Kerrigan wrote:

>On February 09, 2003 at 23:39:04, Matt Taylor wrote:
>
>>>Compilers that inline code and do "fastcalls" negate any benefit that register
>>>windowing gives you.
>>On an architecture like Sparc or IA-64 that gives you enough registers to do so.
>>Let's start counting...I have 8 registers on IA-32...1 used as frame pointer...1
>>as stack pointer...3 get preserved by convention...hmm. I guess that leaves -3-
>>registers for "fastcall" convention. This is why IA-32 usually doesn't even
>>bother with fastcall.
>
>Sure, programmer visible registers. Doesn't the P6 have 40 rename registers? Who
>knows how many the P4 has.

I don't see your point. Rename registers are not programmer-visible, and they don't assist a fastcall convention.

For the sake of discussing performance, I am restricting "compilers in general" to GCC, VC, and Intel C. They are without a doubt the most common and most popular compilers for IA-32.

Register renaming does not even begin to mitigate the problem when I have a function that takes 2 register parameters and needs 3 temp registers for the computation. By convention, IA-32 compilers will pass parameters in eax, ecx, and edx. Conveniently, the compilers also assume ebx, esi, edi, ebp, and obviously esp are preserved across function calls. The function has 1 free register and needs 2 more for computation, and usually it is not free to reuse the other 2 non-preserved registers, because they still hold the parameters. The function therefore stores two of esi, edi, or ebx on the stack (occasionally ebp is included too) and uses them. The compiler might instead store the parameters on the stack, use the 3 registers, and reload the values from the stack. In either case, it's lose-lose.

I would assert that -most- fastcall functions across -most- applications may as well be inline. It is rare to see a lengthy algorithm that uses at most 3 variables. It's rare to see any algorithm that uses at most 3 variables.
>>nothing by passing parameters in registers when the called function has to turn
>>around and put them on the stack again because it needs registers for
>>computation.
>
>Hmm. The computations that my functions do tend to require the arguments that
>are passed to them.

Read the description above.

>>Not always. Crafty scales with clock speed, and it consistently blows the cache.
>>I can't explain that, but I haven't thought through it yet.
>
>Riiiiiiiiiight. You're going to be thinking for a very long time.
>
>What's the name of the program you've been using to read your processor's
>performance registers?
>
>-Tom

Evidence of Crafty blowing the cache was already given by Dr. Hyatt.

-Matt
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.