Author: Robert Hyatt
Date: 15:30:07 02/03/04
On February 03, 2004 at 16:18:46, Robert Hyatt wrote:

>On February 03, 2004 at 15:49:27, Vincent Diepeveen wrote:
>
>>On February 03, 2004 at 12:15:23, Robert Hyatt wrote:
>>
>>>On February 03, 2004 at 11:45:20, Vincent Diepeveen wrote:
>>>
>>>>On February 03, 2004 at 03:13:29, Gerd Isenberg wrote:
>>>>
>>>>>On February 03, 2004 at 01:03:29, Jay Urbanski wrote:
>>>>>
>>>>>>On February 02, 2004 at 22:41:19, Robert Hyatt wrote:
>>>>>>
>>>>>>>On February 02, 2004 at 20:06:29, David Rasmussen wrote:
>>>>>>>
>>>>>>>>Does the Opteron have firstBit, lastBit and popCount instructions? Or at least
>>>>>>>>something that makes calculating them easier than on x86-32?
>>>>>>>>
>>>>>>>>/David
>>>>>>>
>>>>>>>
>>>>>>>Has the same BSF/BSR instructions, but no popcnt that I have found.  Note
>>>>>>>that BSF/BSR work on 64 bit values if you want.  I have inline asm to do
>>>>>>>all three for gcc if you are interested.
>>>>>>
>>>>>>I understand there is a popcount instruction.  I also understand it's
>>>>>>undocumented.
>>>>>
>>>>>Do you have any opcode or further hints?
>>>>>That would be great - a 4 cycle vector path popcount ;-)
>>>>
>>>>And deadslow.
>>>
>>>
>>>Certainly not slower than what we have to do at present...
>>
>>Yes it is slower, because no one ever thought of it in bitboards to write such
>>stuff incremental.
>>
>>No popcnt's needed then.
>>
>>In fact majority of crafty's simple eval you can write incremental and it's way
>>way faster.
>
>No it isn't.  Again, why don't you look first and understand, and _then_
>make comments with some information to back you up?
>
>Look at where I use PopCnt().  Hint:  It is _not_ for computing mobility.  That
>is a simple table lookup for me with no popcnt needed.
>
>
>>
>>Note that i'm not doing evaluation incremental (some datastructures i do) in
>>DIEP, because i am busy making a huge great evaluation function.
>>
>>Readability and portability above anything else!
>>
>>mixing 8 unsigned bits arrays such as several solutions for BSF/BSR replacements
>>at opteron use with signed ints with unsigned long long mixed with unsigned int.
>>I find it all very bad to do.
>
>Opteron needs one instruction.  I don't know what you are talking about..
>
>>
>>It's trivial that i could get diep easily 10% faster at opteron by rewriting all
>>'int' arrays to 'unsigned int'. In that case at several spots in the program
>>unsigned int gets mixed with signed.
>>
>>I find that detestable however.
>>
>>The same logics applies here to doing entire evaluation incremental in
>>bitboards. You can throw away most of your bitboard logics of course as
>>incremental stuff goes faster in non bitboards, but with or without bitboards,
>>doing it incremental is *way* faster and you can avoid expensive stuff like pop
>>counts.
>
>Why don't you look where they are done.  Again, hint:  "behind the pawn hashing"
>so there is no cost to speak of.  Look first, comment last.  Not vice versa.
>
>BTW population counts are _not_ "way expensive".  It doesn't even show up on
>profiling, generally...  last time it was here:
>
>  0.11     88.79     0.10   359435     0.00     0.00  HistoryRefutation
>  0.10     88.88     0.09   228891     0.00     0.00  InterposeSquares
>  0.09     88.96     0.08                             PopCnt
>
>
>That is a "whopping" .09%.  And yes I mean .09% _not_ 9%.
>
>Do you _ever_ get anything right nowadays???

Aw rats.  My math was bad.  I should have used Diepeveen math.  Then reducing
the time spent in PopCnt() could double my NPS and get me 2-3 more plies of
search.  I keep forgetting that if I totally eliminate that .09% overhead, my
NPS will double.  Regardless of what my calculator says...
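
For anyone who wants to check the calculator, here is a minimal sketch of that arithmetic in C. It is an illustration only, not code from Crafty: the function name max_speedup is hypothetical, and it assumes the profile figure quoted above (0.09% of run time in PopCnt) and that NPS scales inversely with total run time. If a routine accounts for a fraction f of the run time, making it free can speed the program up by at most 1 / (1 - f).

```c
/* Upper bound on overall speedup when a routine that accounts for
   `fraction` of total run time is eliminated entirely.
   `fraction` is a ratio in [0, 1), so 0.09% is passed as 0.0009.
   The name max_speedup is hypothetical, chosen for this sketch. */
static double max_speedup(double fraction)
{
    /* Remaining run time is (1 - fraction) of the original, and
       speedup is old time divided by new time. */
    return 1.0 / (1.0 - fraction);
}
```

With the figure from the profile, max_speedup(0.0009) comes out at roughly 1.0009, so a free PopCnt() would buy less than a tenth of a percent more NPS. Doubling NPS would need max_speedup(0.5), that is, a routine eating half the run time.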
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.