Author: Robert Hyatt
Date: 08:44:20 01/06/04
Go up one level in this thread
On January 06, 2004 at 10:56:00, martin fierz wrote: >On January 06, 2004 at 09:40:33, Robert Hyatt wrote: > >>Maybe there is some confusion here. There is _no_ cost for "allocating" >>local variables in a function. And I _do_ mean _NO_. These values are >>simply "on the stack" and when the function is entered, the stack pointer has >>some constant subtracted from it to leave a "hole" that is used to hold local >>function variables. no malloc() or anything like that is done by modern >>compilers, hence zero cost. > >hi bob, > >is this really true? i used to have a HUGE structure describing a move in the >first version of my chess program (don't ask, please...). in my negamax function >i had this > >int negamax(int alpha, int beta, int depth) > { > MOVE movelist[MAXMOVES]; > ...other variables... > > ...typical negamax code... > > } > >i saw clear speed differences depending on the value i used for MAXMOVES - and >in my tests, i made sure that using 100 would never overflow in the positions i >was searching, then i set it to e.g. 200 and the program would get a few % >slower, although these additional 100 array elements were never used. > >how can this be explained? i'm not memsetting the array to 0 or anything. > Cache. The stack depends on the cache to stay "fast". If you bump the stack pointer, you are "skipping over memory" and moving the real stack further down in memory. This then has to be loaded into cache as well, displacing something else. however, it should not hurt a thing, with the proviso that _large_ things can cause several bursts of memory allocation as the stack grows and the O/S has to allocate new real pages of memory to contain it. But once you use that big stack once, you should be set for "life" (until the program terminates). >like uri, i thought the program was allocating memory every time it entered the >function, and that that was taking time. i ended up making my MOVE variable much >smaller - although it's still much larger than crafty's :-) See my response to Uri. Here is another easy to follow example from a Linux compile. The function is this: int global[16]; int func() { int foo[16], i; foo[0]=global[0]; for (i=1;i<16;i++) foo[i]=foo[i-1]*global[i]; return foo[15]; } here is what the compiler produces (.s) non-optimized: func: pushl %ebp ;save ebp movl %esp, %ebp ;copy esp to ebp subl $88, %esp subtract 88 from esp that handles 64 bytes or 16 words for foo[16], 4 bytes for i note that ebp-72 is first byte of foo ebp-76 is 4 bytes for i I'd have to think a bit to figure out why it saved 88 bytes of stack. movl global, %eax movl %eax, -72(%ebp) movl $1, -76(%ebp) .L2: cmpl $15, -76(%ebp) jle .L5 jmp .L3 .L5: movl -76(%ebp), %ecx movl -76(%ebp), %eax movl -76(%ebp), %edx movl -76(%ebp,%eax,4), %eax imull global(,%edx,4), %eax movl %eax, -72(%ebp,%ecx,4) leal -76(%ebp), %eax incl (%eax) jmp .L2 .L3: movl -12(%ebp), %eax leave ret I've added a few comments above... > >i thought perhaps i should allocate the movelist beforehand as a global, like >this: > >MOVE *movelist[MAX_SEARCH_DEPTH] >for(i=0;i<MAX_SEARCH_DEPTH;i++) > movelist[i] = malloc(sizeof(MOVE)*MAXMOVES) > >and then in the negamax function use the appropriate movelist for that search >depth. that turned out to be significantly slower than the local variable >approach. again, i don't understand why that would be... - anyone? > >cheers > martin
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.