Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Bitboard engine speed difference on different processors

Author: Uri Blass

Date: 05:50:32 04/19/03

Go up one level in this thread


On April 19, 2003 at 08:13:34, Tony Werten wrote:

>On April 19, 2003 at 07:52:43, Uri Blass wrote:
>
>>On April 19, 2003 at 06:35:42, Tony Werten wrote:
>>
>>>On April 19, 2003 at 05:08:40, Uri Blass wrote:
>>>
>>>>On April 19, 2003 at 04:54:29, Tony Werten wrote:
>>>>
>>>>>On April 19, 2003 at 04:07:37, Uri Blass wrote:
>>>>>
>>>>>>On April 19, 2003 at 03:37:06, Tony Werten wrote:
>>>>>>
>>>>>>>On April 17, 2003 at 20:35:35, James Robertson wrote:
>>>>>>>
>>>>>>>>Out of curiosity I tested just the move generation and basic board functions of
>>>>>>>>my bitboard chess program on several different computers. My home computer is a
>>>>>>>>Pentium 933mhz, and the other computers I used were Athlons in the 1.6ghz range.
>>>>>>>>
>>>>>>>>My program's move generator runs at roughly the same speed on both systems. I
>>>>>>>>was surprised and tested using several different compilers (VC5, VC6, .NET,
>>>>>>>>gcc), under Windows and under Linux. To compare more easily, I wrote a simple
>>>>>>>>non-bitboard move generator and tested this on all of the machines. The speed
>>>>>>>>differences scaled with the speed of the processors, which seemed logical.
>>>>>>>>However, I still cannot explain why the bitboard functions are so much slower on
>>>>>>>>the faster computers. The only difference I can see is that my home computer is
>>>>>>>>a pentium and the others are athlons.
>>>>>>>>
>>>>>>>>It seems strange that this would make such a large difference. Can anyone give
>>>>>>>>any reasons why? I used no assembly, just C/C++ code, with all the default
>>>>>>>>compiler options on all tests.
>>>>>>>
>>>>>>>I think it's because bitboards tend to fill up the caches, so memory becomes the
>>>>>>>bottleneck.
>>>>>>>
>>>>>>>With other approaches this doesn't happen, ( until you add the big stuff like
>>>>>>>eval ) so all things stay in chache  wich makes it (almost) only processor
>>>>>>>limited.
>>>>>>>
>>>>>>>Tony
>>>>>>
>>>>>>Does it mean that bitboard chess programs or programs with big evaluation are
>>>>>>basically optimized for the old hardware because they cannot get much from new
>>>>>>hardware?
>>>>>
>>>>>No, because you're doing more then, wich can go faster with faster hardware. ie
>>>>>You don't reach the move generation bottleneck, but have others.
>>>>>
>>>>>But if all you are doing, has a memory bottleneck (wich happens when only
>>>>>generating moves with bitboards ), a faster cpu won't help very much.
>>>>>
>>>>>Tony
>>>>
>>>>The question is how can I identify memory bottleneck.
>>>>
>>>>I use bitboards only for pawn structure but I have a big move generator(some
>>>>thousands of C lines).
>>>
>>>I don't think the amount of code counts. It's the amount of datastructures it
>>>operates on.
>>
>>In that case I do not understand why bitboard cause a problems.
>>
>>How much data structure do you have with bitboards.
>
>1-2 MB depending on your implementation.

How do you get these numbers?

My pawn datastructure is less than a kbyte.
Most of it is
BitBoard setmask[64];

64*8 bytes is only 1/2 kbytes.

Most of my data structures are no bitboards.

My attack tables are clearly bigger than it but the biggest table of them is
only 64*16 array of integer that means 4 kbytes.

2 bigger array that I can find now are

char biggest_power[65536];
char smallest_power[65536];

They cover together only 32 kbytes.

The biggest array that I can find is
gen_t gen_dat[GEN_STACK];

GEN_STACK is 30000 and it includes 3 int so it is 90000*4 bytes that
is almost 360 kbytes.

I assume that there the number of moves in my gen_stack is smaller than 30,000
and the number of the move in the gen_stack is basically the number of
legal moves in the line that I search(the sum of perft 1 in all the positions
that appear in the line that the program searches)

In theory the program may crush if it is more than 30000 but I do not know about
a practical case that it happens because in complicated positions I do not
expect the program to search very deep and even if you assume that
every side has 50 legal moves in the line you need to search 60 plies forward to
get above 30000.

GEN_STACK was one of the constants that I copied it's name from tscp but I
understood that the constant in tscp is not going to be enough for me so I
increased it significantly.

Uri



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.