Author: Sune Fischer
Date: 00:42:49 08/18/02
On August 17, 2002 at 23:25:25, Vincent Diepeveen wrote:

>On August 17, 2002 at 14:12:03, Gerd Isenberg wrote:
>
>>Hi Vincent,
>>
>>actually it's faster on my AthlonXP with USE_LOOKUP_XYZ defined!?
>>Strange, i also would bet that the one without lookup is faster,
>>which only needs 4 instructions by your definition. I don't understand this
>>CISC-processors.
>
>Of course the instructions only outperform a table lookup when
>the processors L1 and L2 caches are overloaded busy with storing
>hashtable entries, doing a big evaluation and other stuff and not having
>any time to just put this lookup table in L1 cache.

Do you really think it is faster?
You need an additional offset calculation (not visible in the C code) when doing this lookup, right?
I have a hard time believing a table lookup can beat a naked 4-clock operation. Even fetching things from the L2 cache is 5 clocks (IIRC that was the number Eugene mentioned once), so the table has to be in L1 cache to even stand a chance.

>Only testing the code versus the table is not a good idea obviously.

Right. If that table has to be in L1, it must mean that something else has to get out.

>DIEP doesn't fit in L2 cache at all. I don't need to mention what is
>faster for me :)
>
>>Tested in wrong bishop endings and KBN-K (without ETBs), where these inlines are
>>used quite often in eval and recognizers. In KBN-K 660KNodes versus 656KNodes.
>>But i don't played with optimizations so far (MSC++ minimize size optimization).
>
>see above. this is a typical case where the thing can put many relevant
>things in L1 cache.
>
>>May be it's because m_sUDR[a][b] is accessed frequently with quite equal "a" or
>>"b" and therefore is mostly already in first level cache - or a lack of
>>registers. But the code is definitely shorter with lookup.
>>
>>#ifdef USE_LOOKUP_XYZ
>>inline BOOL sameSquareColor(int a, int b) {return (m_sUDR[a][b]^1) & 1;}
>>inline BOOL oppoSquareColor(int a, int b) {return m_sUDR[a][b] & 1;}
>>#else
>>inline BOOL sameSquareColor(int a, int b) {return (((a^b)>>3)^(a^b)^1) & 1;}
>>inline BOOL oppoSquareColor(int a, int b) {return (((a^b)>>3)^(a^b)) & 1;}
>>#endif
>>
>>(((a^b)>>3)^(a^b)) & 1;
>>2 xors because only one (a^b) is necessary
>>1 shift
>>1 and
>>------
>>4 instructions
>
>ah great, you found a faster way :)

Well, sort of. He bugfixed my (untested) method ;)

-S.

>>see you,
>>Gerd
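
For anyone who wants to convince themselves that the table-free formula quoted above is right, here is a minimal sketch in C that compares it against a plain file-plus-rank parity check for all 64x64 square pairs. It assumes squares are numbered 0..63 with file = sq & 7 and rank = sq >> 3 (the thread does not spell out the numbering). The names square_parity, oppo_square_color, same_square_color, count_mismatches and verify are made up for this illustration and are not from anyone's engine. It only checks correctness and says nothing about which variant is faster.

```c
#include <assert.h>

/* Assumption: squares are 0..63, file = sq & 7, rank = sq >> 3. */

/* Reference: colour parity of a single square is (file + rank) & 1. */
int square_parity(int sq) { return ((sq & 7) + (sq >> 3)) & 1; }

/* The table-free formula from the quoted code: one xor is shared,
   then shift, xor, and. Returns 1 if a and b have opposite colours. */
int oppo_square_color(int a, int b) {
    int x = a ^ b;
    return ((x >> 3) ^ x) & 1;
}

/* Same formula with the low bit flipped: 1 if a and b share a colour. */
int same_square_color(int a, int b) { return oppo_square_color(a, b) ^ 1; }

/* Count the square pairs where the formula disagrees with the reference. */
int count_mismatches(void) {
    int bad = 0;
    for (int a = 0; a < 64; a++) {
        for (int b = 0; b < 64; b++) {
            int differ = square_parity(a) != square_parity(b);
            if (oppo_square_color(a, b) != differ) bad++;
            if (same_square_color(a, b) != !differ) bad++;
        }
    }
    return bad;
}

/* Aborts if the formula and the reference ever disagree. */
void verify(void) { assert(count_mismatches() == 0); }
```

Calling verify() from a small main is enough to run the check. Timing the formula against a lookup table would need a separate benchmark inside a real search, since, as discussed above, the result depends on what else is competing for the L1 cache.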
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.