Computer Chess Club Archives


Subject: Re: Opteron Instruction Set

Author: Robert Hyatt

Date: 13:18:46 02/03/04


On February 03, 2004 at 15:49:27, Vincent Diepeveen wrote:

>On February 03, 2004 at 12:15:23, Robert Hyatt wrote:
>
>>On February 03, 2004 at 11:45:20, Vincent Diepeveen wrote:
>>
>>>On February 03, 2004 at 03:13:29, Gerd Isenberg wrote:
>>>
>>>>On February 03, 2004 at 01:03:29, Jay Urbanski wrote:
>>>>
>>>>>On February 02, 2004 at 22:41:19, Robert Hyatt wrote:
>>>>>
>>>>>>On February 02, 2004 at 20:06:29, David Rasmussen wrote:
>>>>>>
>>>>>>>Does the Opteron have firstBit, lastBit and popCount instructions? Or at least
>>>>>>>something that makes calculating them easier than on x86-32?
>>>>>>>
>>>>>>>/David
>>>>>>
>>>>>>
>>>>>>Has the same BSF/BSR instructions, but no popcnt that I have found.  Note
>>>>>>that BSF/BSR work on 64 bit values if you want.  I have inline asm to do
>>>>>>all three for gcc if you are interested.
>>>>>
>>>>>I understand there is a popcount instruction.  I also understand it's
>>>>>undocumented.
>>>>
>>>>Do you have any opcode or further hints?
>>>>That would be great - a 4 cycle vector path popcount ;-)
>>>
>>>And deadslow.
>>
>>
>>Certainly not slower than what we have to do at present...
>
>Yes it is slower, because no one ever thought of it in bitboards to write such
>stuff incremental.
>
>No popcnt's needed then.
>
>In fact majority of crafty's simple eval you can write incremental and it's way
>way faster.

No it isn't.  Again, why don't you look first and understand, and _then_ make
comments with some information to back you up?

Look at where I use PopCnt().  Hint:  It is _not_ for computing mobility.  That
is a simple table lookup for me with no popcnt needed.


>
>Note that i'm not doing evaluation incremental (some datastructures i do) in
>DIEP, because i am busy making a huge great evaluation function.
>
>Readability and portability above anything else!
>
>mixing 8 unsigned bits arrays such as several solutions for BSF/BSR replacements
>at opteron use with signed ints with unsigned long long mixed with unsigned int.
>I find it all very bad to do.

Opteron needs one instruction.  I don't know what you are talking about.
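
To illustrate the "one instruction" point, here is a minimal sketch using GCC's
bit-scan builtins rather than hand-written inline asm.  This is not Crafty's
actual code: the names first_bit and last_bit are hypothetical, and the bit
numbering (bit 0 = least significant) is an assumption.  On x86-64 the compiler
typically turns each builtin into a single bit-scan instruction; both are
undefined for a zero argument.

```c
#include <stdint.h>

/* Index of the least significant set bit (bit 0 = LSB).
   Assumption: caller guarantees b != 0; the builtin is undefined for 0. */
static inline int first_bit(uint64_t b) {
    return __builtin_ctzll(b);          /* count trailing zeros */
}

/* Index of the most significant set bit.
   Assumption: caller guarantees b != 0; the builtin is undefined for 0. */
static inline int last_bit(uint64_t b) {
    return 63 - __builtin_clzll(b);     /* 63 minus count of leading zeros */
}
```

Everything works on plain unsigned 64-bit values here, so no lookup arrays or
signed/unsigned mixing are involved.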

>
>It's trivial that i could get diep easily 10% faster at opteron by rewriting all
>'int' arrays to 'unsigned int'. In that case at several spots in the program
>unsigned int gets mixed with signed.
>
>I find that detestable however.
>
>The same logics applies here to doing entire evaluation incremental in
>bitboards. You can throw away most of your bitboard logics of course as
>incremental stuff goes faster in non bitboards, but with or without bitboards,
>doing it incremental is *way* faster and you can avoid expensive stuff like pop
>counts.

Why don't you look where they are done?  Again, hint:  "behind the pawn hashing"
so there is no cost to speak of.  Look first, comment last.  Not vice versa.
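
A rough sketch of what "behind the pawn hashing" means: the count is computed
only when the pawn structure is not already in the hash table, so most
evaluations never execute it.  This is an illustration only, not Crafty's code;
PawnEntry, pawn_table, PAWN_HASH_SIZE, cached_pawn_count and the popcnt_calls
counter are all hypothetical names.

```c
#include <stdint.h>

#define PAWN_HASH_SIZE 4096            /* hypothetical size; must be a power of two */

typedef struct {
    uint64_t key;                      /* hash signature of the pawn structure */
    int      pawn_count;               /* cached population count */
    int      valid;                    /* 0 until the slot has been filled */
} PawnEntry;

static PawnEntry pawn_table[PAWN_HASH_SIZE];
static long popcnt_calls = 0;          /* instrumentation for this sketch only */

/* Software population count: one loop iteration per set bit. */
static int slow_pop_count(uint64_t b) {
    int n = 0;
    popcnt_calls++;
    while (b) {
        b &= b - 1;                    /* clear the least significant set bit */
        n++;
    }
    return n;
}

/* Return the pawn count, doing the population count only on a cache miss. */
static int cached_pawn_count(uint64_t pawn_key, uint64_t pawns) {
    PawnEntry *e = &pawn_table[pawn_key & (PAWN_HASH_SIZE - 1)];
    if (!e->valid || e->key != pawn_key) {
        e->key = pawn_key;
        e->pawn_count = slow_pop_count(pawns);
        e->valid = 1;
    }
    return e->pawn_count;
}
```

Calling cached_pawn_count twice with the same key increments popcnt_calls only
once; the second call is a plain table read.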

BTW population counts are _not_ "way expensive".  They don't even show up in
profiling, generally.  Last time I profiled, it was here:

  0.11     88.79     0.10   359435     0.00     0.00  HistoryRefutation
  0.10     88.88     0.09   228891     0.00     0.00  InterposeSquares
  0.09     88.96     0.08                             PopCnt


That is a "whopping" .09%.  And yes I mean .09% _not_ 9%.
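
For reference, one common software population count looks like the sketch
below.  It is not necessarily what Crafty's PopCnt does, and the name pop_count
is hypothetical.  The loop runs once per set bit, so it is cheap on the sparse
bitboards typical of chess.

```c
#include <stdint.h>

/* Count set bits by repeatedly clearing the lowest set bit.
   Runs one iteration per set bit; returns 0 for an empty bitboard. */
static inline int pop_count(uint64_t b) {
    int n = 0;
    while (b) {
        b &= b - 1;   /* clear the least significant set bit */
        n++;
    }
    return n;
}
```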

Do you _ever_ get anything right nowadays???




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.