Author: Christophe Theron
Date: 11:23:37 04/10/04
On April 10, 2004 at 12:51:03, Gerd Isenberg wrote:

>On April 10, 2004 at 09:43:09, Christophe Theron wrote:
>
>>On April 10, 2004 at 04:42:06, Gerd Isenberg wrote:
>>
><snip>
>>>During the "evolution" of my program from rotated to fill based, things changed
>>>a bit. I probably do a lot for "nothing" - but I do it only once, and I do it
>>>unconditionally and in parallel with other tasks. With today's super-pipelined
>>>processors, performing two or four independent, unconditional instruction
>>>chains often doesn't matter so much, as long as you have enough registers...
>>
>>
>>I wouldn't be so sure...
>>
>
>I mean such pipeline miracles, using both float/mmx/xmm ALUs (mul/add) and other
>resources perfectly:
>
>-----------------------------------------------------------------------------
>Software Optimization Guide for AMD Athlon™ 64 and AMD Opteron™ Processors
>
>Chapter 9, Optimizing with SIMD Instructions, pages 227/228
>
>....
>
>Multiplying four complex single-precision numbers only takes 17 cycles as
>opposed to 14 cycles to multiply one complex single-precision number. The
>floating-point pipes are kept busy by feeding new instructions into the
>floating-point pipeline each cycle. In the arrangement above, 24 floating-point
>operations are performed in 17 cycles, achieving more than a 3.5x increase in
>performance.
>-----------------------------------------------------------------------------
>
>Of course an extreme case, but I have already gained some experience with MMX
>fill stuff, doing up to four directions in parallel...
>
>
>>>One other example I have in mind with my future 64-bit approach is using pairs
>>>of bitboards to generate white as well as black sliding attacks at once with
>>>128-bit xmm registers.
>>
>>
>>I think there is even less use for attack tables for both sides than for attack
>>tables of the side to move.
>
>OK, with pure pseudo-legal move generation, for sure. But for "more accurate"
>pruning/reduction/extension decisions, lazy eval decisions, eval stuff, stalemate
>detection at the leaves, and SEE-like move sorting, etc.?
>
>I will give it a try: all attacks, all sliders including the king as a metaqueen,
>disjoint direction-wise for each sliding piece (piece kind) and combined
>piece-kind- and direction-wise, pinned pieces, remove-checker, several taboo
>bitboards, hanging, en prise, check targets, legal moves, direction-wise move
>targets[ply] for move-generation bookkeeping... in about 300-500 cycles (and
>probably even faster with future CPUs, e.g. if the xmm ALUs become 128 bits
>wide, as already mentioned in the AMD optimization guide).
>
>OK, for a lot of lazy eval cutoffs it is still too expensive.
>Otherwise it may help to avoid "wrong" cutoffs and to sniff out a forced mate
>threat or other tactical stuff, not to mention hiding a prefetched hash read.
>What is more important?
>
>Maybe I will learn that all this bitboard stuff is evil ;-)
>
>Cheers,
>Gerd


I would suggest first designing your selective algorithms and then optimizing
them. Using bitboards just because they can do great things is not, IMO, a good
enough reason.


    Christophe
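
For reference, a minimal sketch of the kind of fill Gerd describes: a
Kogge-Stone east fill computed for two bitboards at once in one 128-bit xmm
register (for example white sliders in the low quadword, black sliders in the
high one), written with SSE2 intrinsics in C. The names, the square mapping
(a1 = bit 0, so a left shift by one moves a piece one file east) and the
packing of the two colours are illustrative assumptions, not Gerd's actual
code; the other directions follow the same pattern with different shift counts
and wrap masks.

-----------------------------------------------------------------------------
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

typedef uint64_t U64;

/* Kogge-Stone "occluded fill" towards the east, computed for two bitboards
 * at once: white sliders in the low quadword, black sliders in the high one.
 * empty is the set of empty squares, duplicated into both halves.  The notA
 * mask stops bits from wrapping from the h-file onto the a-file above. */
static __m128i east_occluded2(__m128i sliders, __m128i empty)
{
    const __m128i notA = _mm_set1_epi64x((long long)0xFEFEFEFEFEFEFEFEULL);

    empty   = _mm_and_si128(empty, notA);
    sliders = _mm_or_si128(sliders, _mm_and_si128(empty, _mm_slli_epi64(sliders, 1)));
    empty   = _mm_and_si128(empty,  _mm_slli_epi64(empty, 1));
    sliders = _mm_or_si128(sliders, _mm_and_si128(empty, _mm_slli_epi64(sliders, 2)));
    empty   = _mm_and_si128(empty,  _mm_slli_epi64(empty, 2));
    sliders = _mm_or_si128(sliders, _mm_and_si128(empty, _mm_slli_epi64(sliders, 4)));
    return sliders;
}

/* East attacks for both colours: shift the fill one more step so the first
 * blocker (or the board edge) is included as an attacked square. */
static __m128i east_attacks2(__m128i sliders, __m128i empty)
{
    const __m128i notA = _mm_set1_epi64x((long long)0xFEFEFEFEFEFEFEFEULL);
    return _mm_and_si128(_mm_slli_epi64(east_occluded2(sliders, empty), 1), notA);
}
-----------------------------------------------------------------------------

Packing the pair is just _mm_set_epi64x(black_sliders, white_sliders); whether
computing both colours at once actually pays off is exactly the trade-off
discussed above.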