Author: Gian-Carlo Pascutto
Date: 01:54:18 05/12/01
Go up one level in this thread
On May 12, 2001 at 01:31:20, Robert Hyatt wrote: >We are comparing apples and oranges? BSR/BSF are not popcnt instructions in >Crafty. They are to handle FirstOne()/LastOne() operations. PopCnt() is >a different function altogether. I think Dan got confused somehow. There are indeed no BSR/BSF's in the original popcount. >I don't think your code will outrun mine for 0 1 or 2 bits set, which >is the major case in the places I use it. Actually it will. When I tested it in Crafty it made it 1% faster _overall_, which indicates a significant speedup in the popcount function alone, for real positions. That was on an AMD Athlon machine. The reason why it is faster is that it does not have _any_ branches, whereas your code has several test/jump combinations. When I added an early-out test like your code uses it ran slower. So I guess the reason why it is faster is a) no mispredictions b) will fully pair in both MMX units c) can use extra MMX registers freeing the normal ones up The Pentium III has 2 MMX units just like the Athlon has, but a longer pipeline. So the speedup might be even bigger there. -- GCP
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.