Author: Eugene Nalimov
Date: 16:10:46 01/30/03
Go up one level in this thread
Once again: I put "__assume(0)" in the "default" switch case. That's MS extensions, and it tells the compiler your promise control will never come here. Compiler used that fact when it generated code for switch statement -- there are no extra comparisons/branches in it. In any case, code generated by Symantec C was never been known to be good. Thanks, Eugene On January 30, 2003 at 17:35:50, Ed Schröder wrote: >On January 30, 2003 at 16:28:30, Eugene Nalimov wrote: > >>On January 30, 2003 at 14:34:34, Ed Schröder wrote: >> >>>On January 30, 2003 at 08:36:55, Russell Reagan wrote: >>> >>>>On January 30, 2003 at 07:02:39, Ed Schröder wrote: >>>> >>>>>On January 29, 2003 at 23:29:36, Russell Reagan wrote: >>>>> >>>>>>http://members.home.nl/matador/chess840.htm#INTRO >>>>>> >>>>>>From Ed's page... >>>>>> >>>>>>switch (piece_type) { case 0 : goto empty; >>>>>> case 1 : goto white_pawn; // evaluate white pawn >>>>>> case 2 : goto white_knight; // evaluate white knight >>>>>> case 3 : goto white_bishop; >>>>>> case 4 : goto white_rook; >>>>>> case 5 : goto white_queen; >>>>>> case 6 : goto white_king; >>>>>> case 7 : goto black_pawn; // evaluate black pawn >>>>>> case 8 : goto black_knight; >>>>>> case 9 : goto black_bishop; >>>>>> case 10 : goto black_rook; >>>>>> case 11 : goto black_queen; >>>>>> case 12 : goto black_king; } >>>>> >>>>>Any reasonable compiler will translate the above into 2 assembler statements, >>>>>someling like: >>>>> >>>>> mov EAX, dword ptr piece_type >>>>> JMP TABLE [EAX] >>>>> >>>>>Nothing can beat that. Just generate an ASM file to see it work. >>>>> >>>>>Explanation: the trick is that the compiler will generate an internal table (not >>>>>visible for the programmer) where it calculates all the effective addresses of >>>>>the labels mentioned in the switch/case statement. >>>>> >>>>>Then using the "piece_type" in register EAX it does an "indirect jump", only a >>>>>few cycles. >>>>> >>>>>Of course, the sequence must be in reasonable order otherwise the compiler will >>>>>not recognize the possibility. >>>>> >>>>>Ed >>>> >>>> >>>>Hi Ed, >>>> >>>>Let me see if I understand. It is an indirect jump, which will be at least as >>>>slow as a mispredicted conditional. The reason this is faster is because since >>>>you have 13 possible values for piece_type, you do ONE indirect jump as opposed >>>>to (potentially) 12 mispredicted conditionals. Is this your reasoning? >>> >>>You got it! >>> >>>Ed > > > >>Source file: >> >>int foo (int i) >>{ >> switch (i) { >> case 0: >> return 100; >> case 1: >> return 101; >> case 2: >> return 102; >> case 3: >> return 103; >> case 4: >> return 104; >> case 5: >> return 105; >> case 6: >> return 106; >> case 7: >> return 107; >> case 8: >> return 108; >> case 9: >> return 109; >> case 10: >> return 110; >> case 11: >> return 111; >> default: >> __assume(0); >> } >>} > > >Eugene, the issue is about "jumps"... > >The code file generated by Symantec 7.2 > >includelib SDX.lib >_TEXT segment dword use32 public 'CODE' ;size is 26 >_TEXT ends >_DATA segment dword use32 public 'DATA' ;size is 48 >_DATA ends >CONST segment dword use32 public 'CONST' ;size is 0 >CONST ends >_BSS segment dword use32 public 'BSS' ;size is 172 >_BSS ends >DGROUP group CONST,_BSS,_DATA > comm near _fp:byte:04h > comm near _fpd:byte:04h > comm near _fp3:byte:04h > extrn __acrtused > > public _main >_TEXT segment > assume CS:_TEXT >_main: > push EBX > mov EBX,DGROUP:_BSS[00h] > sub EBX,2 > cmp EBX,0Bh > ja L16 > jmp dword ptr DGROUP:_DATA[00h][EBX*4] >L16: xor EAX,EAX > pop EBX > ret >_TEXT ends >_DATA segment > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] > dd offset _TEXT:_main[016h] >_DATA ends >CONST segment >CONST ends >_BSS segment >_BSS ends > end > >============= > >My best, > >Ed > > > >>Code emitted by the compiler: >> >>; Listing generated by Microsoft (R) Optimizing Compiler Version 13.10.2179 >> >> TITLE s.c >> .386P >>include listing.inc >>if @Version gt 510 >>.model FLAT >>else >>_TEXT SEGMENT PARA USE32 PUBLIC 'CODE' >>_TEXT ENDS >>_DATA SEGMENT DWORD USE32 PUBLIC 'DATA' >>_DATA ENDS >>CONST SEGMENT DWORD USE32 PUBLIC 'CONST' >>CONST ENDS >>_BSS SEGMENT DWORD USE32 PUBLIC 'BSS' >>_BSS ENDS >>$$SYMBOLS SEGMENT BYTE USE32 'DEBSYM' >>$$SYMBOLS ENDS >>_TLS SEGMENT DWORD USE32 PUBLIC 'TLS' >>_TLS ENDS >>FLAT GROUP _DATA, CONST, _BSS >> ASSUME CS: FLAT, DS: FLAT, SS: FLAT >>endif >> >>INCLUDELIB LIBC >>INCLUDELIB OLDNAMES >> >>PUBLIC @foo@4 >>; Function compile flags: /Ogty >>_TEXT SEGMENT >>@foo@4 PROC NEAR >>; _i$ = ecx >>; File c:\repro\s.c >>; Line 3 >> jmp DWORD PTR $L568[ecx*4] >>$L546: >>; Line 5 >> mov eax, 100 ; 00000064H >>; Line 31 >> ret 0 >>$L547: >>; Line 7 >> mov eax, 101 ; 00000065H >>; Line 31 >> ret 0 >>$L548: >>; Line 9 >> mov eax, 102 ; 00000066H >>; Line 31 >> ret 0 >>$L549: >>; Line 11 >> mov eax, 103 ; 00000067H >>; Line 31 >> ret 0 >>$L550: >>; Line 13 >> mov eax, 104 ; 00000068H >>; Line 31 >> ret 0 >>$L551: >>; Line 15 >> mov eax, 105 ; 00000069H >>; Line 31 >> ret 0 >>$L552: >>; Line 17 >> mov eax, 106 ; 0000006aH >>; Line 31 >> ret 0 >>$L553: >>; Line 19 >> mov eax, 107 ; 0000006bH >>; Line 31 >> ret 0 >>$L554: >>; Line 21 >> mov eax, 108 ; 0000006cH >>; Line 31 >> ret 0 >>$L555: >>; Line 23 >> mov eax, 109 ; 0000006dH >>; Line 31 >> ret 0 >>$L556: >>; Line 25 >> mov eax, 110 ; 0000006eH >>; Line 31 >> ret 0 >>$L557: >>; Line 27 >> mov eax, 111 ; 0000006fH >>; Line 31 >> ret 0 >> npad 1 >>$L568: >> DD $L546 >> DD $L547 >> DD $L548 >> DD $L549 >> DD $L550 >> DD $L551 >> DD $L552 >> DD $L553 >> DD $L554 >> DD $L555 >> DD $L556 >> DD $L557 >>@foo@4 ENDP >>_TEXT ENDS >>END >> >>Thanks, >>Eugene >> >>> >>> >>>>Thanks, >>>>Russell >>>> >>>> >>>> >>>>> >>>>>>On one portion of Ed's discussion of Rebel (see above), he talks about using >>>>>>"indirect addressing". I get the impression from Ed's words that this method is >>>>>>supposed to fast. I understand his discussion to mean that if you create a >>>>>>switch statement like he does, you create a jump table and avoid a bunch of >>>>>>conditionals. >>>>>> >>>>>>However, in past discussions, I recall hearing that using a function pointer is >>>>>>going to be at least as slow as conditional, so I asked someone, and was told >>>>>>that Ed's example should be no different than using a function pointer or >>>>>>virtual functions. >>>>>> >>>>>>Ed talks about this method as if it is a good thing to use. So what is the >>>>>>advantage of it? Either someone is mistaken, or Ed and the guy I talked to are >>>>>>talking about different things.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.