Author: Tim Foden
Date: 05:58:33 04/20/03
Go up one level in this thread
On April 19, 2003 at 07:52:43, Uri Blass wrote: >On April 19, 2003 at 06:35:42, Tony Werten wrote: > >>On April 19, 2003 at 05:08:40, Uri Blass wrote: >> >>>On April 19, 2003 at 04:54:29, Tony Werten wrote: >>> >>>>On April 19, 2003 at 04:07:37, Uri Blass wrote: >>>> >>>>>On April 19, 2003 at 03:37:06, Tony Werten wrote: >>>>> >>>>>>On April 17, 2003 at 20:35:35, James Robertson wrote: >>>>>> >>>>>>>Out of curiosity I tested just the move generation and basic board functions of >>>>>>>my bitboard chess program on several different computers. My home computer is a >>>>>>>Pentium 933mhz, and the other computers I used were Athlons in the 1.6ghz range. >>>>>>> >>>>>>>My program's move generator runs at roughly the same speed on both systems. I >>>>>>>was surprised and tested using several different compilers (VC5, VC6, .NET, >>>>>>>gcc), under Windows and under Linux. To compare more easily, I wrote a simple >>>>>>>non-bitboard move generator and tested this on all of the machines. The speed >>>>>>>differences scaled with the speed of the processors, which seemed logical. >>>>>>>However, I still cannot explain why the bitboard functions are so much slower on >>>>>>>the faster computers. The only difference I can see is that my home computer is >>>>>>>a pentium and the others are athlons. >>>>>>> >>>>>>>It seems strange that this would make such a large difference. Can anyone give >>>>>>>any reasons why? I used no assembly, just C/C++ code, with all the default >>>>>>>compiler options on all tests. >>>>>> >>>>>>I think it's because bitboards tend to fill up the caches, so memory becomes the >>>>>>bottleneck. >>>>>> >>>>>>With other approaches this doesn't happen, ( until you add the big stuff like >>>>>>eval ) so all things stay in chache wich makes it (almost) only processor >>>>>>limited. >>>>>> >>>>>>Tony >>>>> >>>>>Does it mean that bitboard chess programs or programs with big evaluation are >>>>>basically optimized for the old hardware because they cannot get much from new >>>>>hardware? >>>> >>>>No, because you're doing more then, wich can go faster with faster hardware. ie >>>>You don't reach the move generation bottleneck, but have others. >>>> >>>>But if all you are doing, has a memory bottleneck (wich happens when only >>>>generating moves with bitboards ), a faster cpu won't help very much. >>>> >>>>Tony >>> >>>The question is how can I identify memory bottleneck. >>> >>>I use bitboards only for pawn structure but I have a big move generator(some >>>thousands of C lines). >> >>I don't think the amount of code counts. It's the amount of datastructures it >>operates on. > >In that case I do not understand why bitboard cause a problems. > >How much data structure do you have with bitboards. > >some 64 bit varaible and some arrays does not seem to me much relative to the >hash tables that is the biggest data structure that programs use. > >If size is not the amount of data structure then how do you calculate that >amount? In Green Light the main lookup tables required for the rotated bitboard implementation use 512KB of memory. UINT64 m_rankMoves[64][256]; // 128K pos/mask UINT64 m_fileMoves[64][256]; // 128K pos/mask UINT64 m_diag1Moves[64][256]; // 128K pos/mask UINT64 m_diag2Moves[64][256]; // 128K pos/mask it is certainly possible to reduce this, if you allow the code that generates moves to do more lookups in smaller tables, and some extra logic operations. I have another program I am developing, that also uses the same rotated bitboards as Green Light, but only uses these tables: BYTE s_patMoves[8][256]; // 2.00K [R/F][pattern] UINT64 s_extendVert[256]; // 2.00K [pattern] UINT64 s_extendHorz[256]; // 2.00K [pattern] const UINT64 s_rankMask[8]; // 0.06K [file] const UINT64 s_fileMask[8]; // 0.06K [rank] const UINT64 s_diagAMask[15]; // 0.12K [diagA] const UINT64 s_diagHMask[15]; // 0.12K [diagH] It currently generates moves at about 90% of the speed of Green Light. I have yet to measure whether the better cache performance would make a difference to the chess program overall (so far this new program only generates moves and performs perft). Green Light movegen perf: Generate moves: Iterations: 1000000 Time: 0.751 Iterations/sec: 1331557.9 Moves/sec: 26631158.5 Generate, make, unmake moves: Iterations: 1000000 Time: 3.545 Iterations/sec: 282087.4 Moves/sec: 5641748.9 New program movegen perf: Generate moves: Iterations: 1000000 Time: 0.841 Iterations/sec: 1189060.6 Moves/sec: 23781212.8 Generate/make/undo moves: Iterations: 1000000 Time: 5.608 Iterations/sec: 178316.7 Moves/sec: 3566333.8 Cheers, Tim.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.