Author: Eugene Nalimov
Date: 11:54:21 11/18/02
Go up one level in this thread
I am talking about Visual Studio .Net, that (among other things) includes native
VC7.0 compiler. That compiler knows about P4 (VC6.0 doesn't, as it was released
5 years ago), and it does not generate shift instructions when alternatives are
cheaper.
Yes, shifts are very common in C/C++ code, but now you probably understand that
good compiler often can get rid of them.
Thanks,
Eugene
On November 18, 2002 at 14:47:52, Vincent Diepeveen wrote:
>On November 18, 2002 at 12:15:09, Eugene Nalimov wrote:
>
>You talking about the NET visual c++ or something that only you
>have and no one else can use?
>
>The NET 7.0 no one can use because you can decompile it too well.
>You can of course buy a $$$$$ wrangler or whatever for that but that's
>not the meaning of how i compile stuff. I'll produce something with
>visual c++ 6.0 sp4 procpack instead, which is 2% faster for me anyway
>at the P3 and K7 than visual c++ 7.0 NET.
>
>Whatever happens the SHL/SHR/SAL instructions which are used a lot in
>all software that is compatible, they are *very* slow at the P4!
>
>Which in itself is weird.
>
>
>>I just compiled the C function using Visual C 7.0. There is no single "shift
>>left" instruction in the generated code.
>>
>>So next please count the inctructions produced by the compiler, not by you.
>>
>>Thanks,
>>Eugene
>>
>>On November 18, 2002 at 10:20:49, Vincent Diepeveen wrote:
>>
>>>On November 17, 2002 at 13:30:22, Eugene Nalimov wrote:
>>>
>>>6 compares times 0.5 cycles ==> 3 cycles
>>>5 shiftlefts times 2 cycles ==> 10 cycles
>>>
>>>So it is like 13 cycles to get a single bit out of a bitboard in
>>>this C code is very slow on the P4. 13 cycles is hell of a lot,
>>>i can generate several moves in non-bitboards within that time!
>>>
>>>>On November 17, 2002 at 13:25:55, Vincent Diepeveen wrote:
>>>>
>>>>>On November 17, 2002 at 12:01:02, Gerd Isenberg wrote:
>>>>>
>>>>>>On November 17, 2002 at 10:53:05, Joel wrote:
>>>>>>
>>>>>>>Hey All,
>>>>>>>
>>>>>>>I am a 2nd year Uni student from Australia who has recently gotten into chess
>>>>>>>programming. My first attempt was a simple array-based alpha-beta variant which
>>>>>>>struggled to search more than 6 levels deep in most positions! I think that
>>>>>>>might have something to do with the fact that there was no move ordering,
>>>>>>>transposition table, an expensive evaluation function, no killer moves and weak
>>>>>>>coding :)
>>>>>>>
>>>>>>>I have been working on my second attempt for some time now. It uses Bitboards. I
>>>>>>>have a few questions regarding move generation.
>>>>>>>
>>>>>>>It seems to me that the performance of the Bitboard approach relies somewhat
>>>>>>>heavily on how fast you can retrieve the position of a 1 bit within a 64-bit
>>>>>>>unsigned integer. I looked for sometime on the Internet for some kind of
>>>>>>>magical, hacky solution to this dilemna, and the best I could find was this (b &
>>>>>>>-b) trick which I used in a debatedly clever way. I was just wondering if there
>>>>>>>is any approach significantly better than the one which I will outline below:
>>>>>>>
>>>>>>>1. (b & -b) to clear all 1 bit's except for one.
>>>>>>>2. get this value, mod it by 67 (which has the property that every possible
>>>>>>> value returned is unique, thus i can hash to the position of the bit in the
>>>>>>> 64 bit integer.)
>>>>>>>
>>>>>>>I am no expert, but it doesn't seem too ineffecient to me. Any problems?
>>>>>>>
>>>>>>>Also, if there are any improvements, I would prefer to find out about the ones
>>>>>>>which do not involve assembly coding - I do not want to make my program too
>>>>>>>dependant on any one CPU architecture at this stage.
>>>>>>>
>>>>>>>Thanks for your time,
>>>>>>>Joel
>>>>>>
>>>>>>Hi Joel,
>>>>>>
>>>>>>nice idea with the mod 67 to get unique bitvalues. But i fear a 64 bit div/mod
>>>>>>operation is too slow, even with 64bit-processors. If you don't want to use
>>>>>>assembler eg. intels x86 bsf (bit scan foreward), i think using a lookup table
>>>>>>indexed by the byte or 16-bit word is most common.
>>>>>>
>>>>>>I use this approach:
>>>>>>
>>>>>>#define LOWBOARD(bb) (*((UINT32*)&(bb)))
>>>>>>#define HIGHBOARD(bb) (*(((UINT32*)&(bb))+1))
>>>>>>
>>>>>>// precondition: bb not null
>>>>>>__forceinline UINT BitSearch(BitBoard bb)
>>>>>>{
>>>>>> ASSERT(bb != 0);
>>>>>>#ifdef _M_IX86
>>>>>> __asm
>>>>>> {
>>>>>> bsf eax,[bb+4]
>>>>>> xor eax, 32
>>>>>> bsf eax,[bb]
>>>>>> }
>>>>>>#else
>>>>>> BitBoard lsbb = bb & (-(__int64)bb);
>>>>>> UINT32 lsb = LOWBOARD(lsbb) | HIGHBOARD(lsbb);
>>>>>> return ((((((((((HIGHBOARD(lsbb)!=0) <<1)
>>>>>> +((lsb & 0xffff0000)!=0))<<1)
>>>>>> +((lsb & 0xff00ff00)!=0))<<1)
>>>>>> +((lsb & 0xf0f0f0f0)!=0))<<1)
>>>>>> +((lsb & 0xcccccccc)!=0))<<1)
>>>>>> +((lsb & 0xaaaaaaaa)!=0);
>>>>>>#endif
>>>>>>}
>>>>>>
>>>>>>Regards,
>>>>>>Gerd
>>>>>
>>>>>Isn't shifting on the P4 *WAY* slower than his mod 67 idea is?
>>>>
>>>>No.
>>>>
>>>>Thanks,
>>>>Eugene
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.