Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: The fastest popcount

Author: Gian-Carlo Pascutto

Date: 01:54:18 05/12/01

Go up one level in this thread


On May 12, 2001 at 01:31:20, Robert Hyatt wrote:

>We are comparing apples and oranges?  BSR/BSF are not popcnt instructions in
>Crafty.  They are to handle FirstOne()/LastOne() operations.  PopCnt() is
>a different function altogether.

I think Dan got confused somehow. There are indeed no BSR/BSF's in
the original popcount.

>I don't think your code will outrun mine for 0 1 or 2 bits set, which
>is the major case in the places I use it.

Actually it will. When I tested it in Crafty it made it 1% faster
_overall_, which indicates a significant speedup in the popcount
function alone, for real positions. That was on an AMD Athlon machine.

The reason why it is faster is that it does not have _any_ branches,
whereas your code has several test/jump combinations.

When I added an early-out test like your code uses it ran slower.

So I guess the reason why it is faster is

a) no mispredictions
b) will fully pair in both MMX units
c) can use extra MMX registers freeing the normal ones up

The Pentium III has 2 MMX units just like the Athlon has, but a
longer pipeline. So the speedup might be even bigger there.

--
GCP



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.