Computer Chess Club Archives


Subject: Re: Implementation of the abs() function [o.t.]

Author: Dieter Buerssner

Date: 14:54:50 07/06/03


On July 06, 2003 at 15:35:34, Gerd Isenberg wrote:

>But I guess for some strange reason this speedup only occurs in this loop.
>Maybe due to some pipeline-length or microcode alignment reason some internal
>hyperthreading-like unrolling occurs. Using all pipes perfectly, processing two
>loop bodies simultaneously with "different" register sets?

This gave me the following idea: try a loop that calls rand(), and add a variable
number of no-ops (I used xor eax, eax, 2 bytes each; one could try other variations).
The source is rather boring, but the results look interesting:

MSVC  -Ox2 -Ob2 -G6 -Gr -GF  (times in seconds)
     randnoop0 13.208
     randnoop1 11.978
     randnoop2 12.197
     randnoop3 12.508
     randnoop4 13.009
     randnoop5 12.428
     randnoop6 12.458
     randnoop7 11.376
     randnoop8 11.737
     randnoop9 11.737
    randnoop10 8.792
    randnoop11 9.554
    randnoop12 9.464
    randnoop13 9.083
    randnoop14 9.474
    randnoop15 10.024
    randnoop16 8.342
    randnoop17 8.272

randnoop17 (calling rand and then doing 17 xor eax,eax) is about 1.6 times as
fast as just calling rand (8.272 vs. 13.208). That is still not the 7.x seconds
used by the "overhead" of omid_abs and calculating the sum. I quickly checked
the assembly, and everything looks normal and comparable.
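
For readers who want to try something similar, here is a rough sketch of what such a test could look like. It is not the code behind the numbers above: the names NOOP1, time_rand_noop0 and time_rand_noop17 are made up for illustration, the inline assembly uses GCC/Clang syntax on x86 (with MSVC one would write `__asm xor eax, eax` instead), and on other targets the filler expands to nothing. The store to a volatile variable is also my addition.

```c
#include <stdlib.h>
#include <time.h>

/* One filler instruction (hypothetical macro). GCC/Clang syntax on x86
   is assumed; on any other compiler or target it expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define NOOP1 __asm__ volatile("xorl %%eax, %%eax" : : : "eax", "cc");
#else
#define NOOP1
#endif
#define NOOP4  NOOP1 NOOP1 NOOP1 NOOP1
#define NOOP16 NOOP4 NOOP4 NOOP4 NOOP4

/* The result is stored here so the call is visibly used. */
static volatile int sink;

/* Time a loop that only calls rand(); returns CPU seconds. */
double time_rand_noop0(long iterations)
{
    clock_t start = clock();
    long i;
    for (i = 0; i < iterations; i++) {
        sink = rand();
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Same loop, followed by 17 filler instructions per iteration. */
double time_rand_noop17(long iterations)
{
    clock_t start = clock();
    long i;
    for (i = 0; i < iterations; i++) {
        sink = rand();
        NOOP16 NOOP1
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling both functions with the same large iteration count and printing the two results gives a comparison of the same kind as the table. The absolute numbers, and whether any speedup shows up at all, will depend on the CPU, the compiler and the library rand.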

The MSVC library rand is more or less the following (a linear congruential
pseudo-random number generator with a power-of-2 modulus, which typically makes
this sort of PRNG rather bad; the deficiency is made up a bit by the right shift):

  return ((state = state * CONST_1 + CONST_2) >> 16) & 0x7fff;

state is a static variable. CONST_1 is 214013, so no fast optimization of the
multiplication by shift/lea tricks is possible.
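
As a self-contained sketch of that one-liner: the function name msvc_style_rand and the explicit state pointer are my own choices for illustration (the library keeps the state in a static variable), and the value 2531011 for CONST_2 is the one commonly reported for the MSVC runtime, not something stated above. The 32-bit unsigned state makes the power-of-2 modulus explicit, since the arithmetic simply wraps modulo 2^32.

```c
#include <stdint.h>

#define CONST_1 214013u   /* multiplier, as given above */
#define CONST_2 2531011u  /* increment; assumed, commonly reported value */

/* One step of the linear congruential generator.
   The caller owns the state (hypothetical interface). */
int msvc_style_rand(uint32_t *state)
{
    /* Unsigned 32-bit arithmetic wraps, i.e. the modulus is 2^32. */
    *state = *state * CONST_1 + CONST_2;

    /* Drop the weak low 16 bits and keep 15 bits: result is 0..32767. */
    return (int)((*state >> 16) & 0x7fff);
}
```

With the state started at 1, the first two values returned are 41 and 18467. Only the upper bits of the state reach the caller, which is why the right shift helps: in a power-of-2 modulus generator the low bits have very short periods.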

Regards,
Dieter




Last modified: Thu, 15 Apr 21 08:11:13 -0700
