Computer Chess Club Archives




Subject: Re: Draw Detection by Move Repetition Procedure -- Comments

Author: Gerd Isenberg

Date: 01:01:50 08/02/04


>Hi Gerd,
>Thanks for the very useful comments.
>The exceptions you mentioned are important but very rare (<1% in practical
>play). I have already thought about similar problems before. The fact is that
>most draw-repetition positions are generated by check sequences whose chains
>consist of an even number of moves, or by moving different pieces. But,
>anyway, we will try to find an elegant solution for those exceptions.
>Your details about the LOOP command are very interesting, I think I have written
>thousands of LOOP commands in Axon!
>best regards,
>(Axon programmer)

Hi Vladan,

Yes, for qsearch one may safely ignore such rare position repetitions.
They are not likely to happen in perpetual checks.

The problem with Athlon's LOOP instruction is that it is implemented as a
VectorPath instruction, which blocks resources that would otherwise be
available in parallel.

Instruction          Opcode  Decode      Latency
LOOP disp8           E2h     VectorPath  8
DEC CX/ECX           49h     DirectPath  1
JNE/JNZ disp8        75h     DirectPath  1

Instruction          Opcode  Decode      Latency
LOOP disp8           E2h     VectorPath  9/8
DEC ECX              49h     DirectPath  1
JNZ/JNE short disp8  75h     DirectPath  1

For LOOP, the first latency value (9!) is for 32-bit mode.
The second is for 64-bit mode.
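
As a rough sketch of the replacement these latency figures suggest, a counted
loop written with LOOP can be rewritten with DEC and JNZ. The label name, the
repeat count and the loop body below are placeholders for illustration, not
code from Axon. One difference to keep in mind: LOOP leaves the flags
untouched, while DEC modifies the arithmetic flags (all except CF), so the
rewrite is only a drop-in replacement if the code after the loop does not rely
on flags set inside the loop body.

```asm
        mov     ecx, 6          ; hypothetical repeat count (must be nonzero)
next_entry:
        ; ... loop body (placeholder) ...
        dec     ecx             ; DirectPath, latency 1
        jnz     next_entry      ; DirectPath, latency 1; replaces "loop next_entry"
```

Both forms decrement ECX and branch back while it is nonzero, so the number of
iterations is the same.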

Another possible optimization concerns rep stosw.

Software Optimization Guide
for AMD Athlon™ 64 and AMD Opteron™ Processors

8.3 Repeated String Instructions


Avoid using the REP prefix when performing string operations, especially when
copying blocks of memory.

In general, using the REP prefix to repeatedly perform string instructions is
less optimal than other methods, especially when copying blocks of memory. For a
discussion of alternate memory-copy methods, see “Appropriate Memory Copying
Routines” on page 112.

Inline REP String with Low Counts
If the repeat count is constant and low (less than eight), expand REP string
instructions into equivalent sequences of simple AMD64 instructions. Use an
inline sequence of loads and stores to accomplish the move. Use a sequence of
stores to emulate rep stos. This technique eliminates the setup overhead of REP
instructions and increases instruction throughput.


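E.g., one may use six MMX stores to zero 48 bytes (the buffer should be 8-byte
aligned):

lea    eax, [chain_list]   ; eax = address of the buffer to clear
pxor   mm0, mm0            ; mm0 = 0
movq   [eax+0*8], mm0      ; six 8-byte stores = 48 bytes
movq   [eax+1*8], mm0
movq   [eax+2*8], mm0
movq   [eax+3*8], mm0
movq   [eax+4*8], mm0
movq   [eax+5*8], mm0
; note: execute emms before any later x87 FPU code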



Last modified: Thu, 15 Apr 21 08:11:13 -0700
