Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: Question to Bob: Crafty , Alpha and FindBit()

Author: Robert Hyatt

Date: 08:33:50 06/08/98

Go up one level in this thread


On June 08, 1998 at 11:18:54, Vincent Diepeveen wrote:

>On June 08, 1998 at 11:01:11, Robert Hyatt wrote:
>
>>On June 08, 1998 at 10:34:23, Vincent Diepeveen wrote:
>>
>>>On June 08, 1998 at 07:37:44, Robert Hyatt wrote:
>>>
>>>>On June 08, 1998 at 07:17:08, Vincent Diepeveen wrote:
>>>>
>>>>>
>>>>>On June 05, 1998 at 21:48:57, Robert Hyatt wrote:
>>>>>
>>>>>>On June 05, 1998 at 17:47:07, Bruce Moreland wrote:
>>>>>>
>>>>>>>
>>>>>>>On June 05, 1998 at 08:53:53, Ernst A. Heinz wrote:
>>>>>>>
>>>>>>>>BTW, we have been waiting to get our hands on a Alpha-21264 for roughly
>>>>>>>>a year because "DarkThought" will hopefully speed up as much on them
>>>>>>>>as "Crafty".
>>>>>>>
>>>>>>>What percentage of execution time do you and Bob spend doing these
>>>>>>>operations, do you think?netscape -install
>>>
>>>>>>>
>>>>>>>bruce
>>>>>>
>>>>>>FindFirstOne() = 1-2% in crafty.  popcnt is much less...  so that won't
>>>>>>help a whole lot.  but the 64 bit stuff helps in general...
>>>>>
>>>>>Well that doesn't say anything concrete.
>>>>>
>>>>>More important is the overall delay.
>>>>>
>>>>>At a P133 intel the speed difference between crafty's move generator
>>>>>and diep's move generator is somewhat more than 3 times.
>>>>>
>>>>>Crafty is more than 3 times slower, because of bitboards, which perform
>>>>>badly at 32 bits. Note that the more possibilities, this 3 times will
>>>>>become
>>>>>4 times.
>>>>>
>>>>>For something taking 1 clockcycle in my program and which can
>>>>>be even pipelined in U + V pipe,
>>>>>crafty needs sometimes even a whole function.
>>>>>
>>>>>Greetings,
>>>>>Vincent
>>>>
>>>>
>>>>yes... but try to generate nothing but captures and compare that, which
>>>>is a major part of the total tree search...
>>>
>>>Where do i need to generate captures for?
>>>
>>>But to answer your question: 2.5 times faster.
>>
>>
>>Because in ordering alpha/beta moves, captures are searched first.  And
>
>this statement is already naive.
>
>I search a move that gives a cutoff. Sometimes a capture, sometimes a
>check,
>sometimes a promotion, depending on what move is gonna give me the
>cutoff.
>
>I think we directly here see why Diep's branching factor is that well.
>I don't have a special order. I pick simply the move that is very
>probably gonna give me a cutoff.
>
>>they cause most of the cutoffs.  If you just produce captures, you don't
>>have to continually wade through the non-captures picking out capture
>>moves to try...
>>
>>and the best test to try is to set up a position in Crafty, and use the
>>"perf" command...  set up a position such as kopec22 which is a
>>reasonable
>>early middlegame position, and type "perf".  On a single processor
>>pentium
>>pro, that produces the following:
>
>We cannot compare our processors. I have a P133 at the same speed,
>but i don't have a Pentium pro SDRAM with 512 kb cache at a fast
>mainboard,
>like you have.
>

I don't have SDRAM nor 512KB cache.  My ALR uses FPM memory and each
cpu has 256K of cache... just like yours probably... except your memory
is probably a bit faster (EDO) vs my fast page mode memory.  But my
Orion
chipset for 4 processors only supports FPM for interleaving...

>I have a P133 laptop with a processor which is exactly 133Mhz,
>and i have a pro 200Mhz 256kb cache with edoram, which is way slower
>than
>your pro.

Wrong, as I said... Yours is actually a little faster, as my old Gateway
with EDO ram was a couple of percent faster than my current machine...


>  My datastructure doesn't fit within 256kb cache, except when
>i'm
>just generating moves... ...so that'll be a fair compare except for the
>SDRAM.
>I'm not sure what influence it has on just generating moves over and
>over again.
>
>>Black(1): perf
>>generated 3300000 moves, time=1.38 seconds
>>generated 2391304 moves per second
>>generated/made/unmade 3300000 moves, time=4.94 seconds
>>generated/made/unmade 668016 moves per second
>
>We must not forget that if i generate moves (3 times faster), that
>i store an entire structure for every move, where you only do
>
>*move++ = int;
>
>We should take this into account.

no reason...  You said "you can generate moves 3x faster".  I gave you
a number of moves I can generate per second.  If you can't generate them
that fast, then you aren't 3x faster.  I do things you don't do... you
do things I don't do.  But we both generate moves...




>
>I'm storing a lot of different stuff with a move which gets used in the
>move ordering and evaluation.
>
>I'm now here under UNIX at a slow HP9000 model 712 (60Mhz).
>When i'm at home i do this test special for you, where i shall replace
>that
>storing of datastructure.
>
>>IE I can generate all moves from that position for white, at 2.4M moves
>>per second, or I can generate all the moves, and then make each one, and
>>do that at 670K moves per second...
>
>>What's your speed?  here is the position, diagram and FEN:
>
>We should compare branching factor, that would be nice.
>You may use your 512MB SDRAM 4 processor against my pro200,
>also nice would be to do this generating of moves at a K6 or P133, which
>clearly will show that factor 3 to 4.


you keep saying that... but you are wrong.  Again, no SDRAM, which
wouldn't
matter anyway until someone gets a BX motherboard that can use the
faster
SDRAM cycle time for the second through fourth words of memory.  And I
don't
have 512K of cache... they were too expensive when I bought this
machine.




>
>>Black(1): d
>>
>>       +---+---+---+---+---+---+---+---+
>>    8  |   |   | *R|   |   | *R| *K|   |
>>       +---+---+---+---+---+---+---+---+
>>    7  |   | *B| *Q| *N| *B| *P| *P|   |
>>       +---+---+---+---+---+---+---+---+
>>    6  |   | *P|   | *P| *P| *N|   | *P|
>>       +---+---+---+---+---+---+---+---+
>>    5  | *P| P |   |   |   |   |   |   |
>>       +---+---+---+---+---+---+---+---+
>>    4  | N |   | P |   | P |   |   |   |
>>       +---+---+---+---+---+---+---+---+
>>    3  | P |   |   | B |   | N |   | P |
>>       +---+---+---+---+---+---+---+---+
>>    2  |   | B |   |   | Q | P | P |   |
>>       +---+---+---+---+---+---+---+---+
>>    1  | R |   |   | R |   |   | K |   |
>>       +---+---+---+---+---+---+---+---+
>>         a   b   c   d   e   f   g   h
>>
>
>Cool this has a lot of possibilities. That'll go with many many million
>moves
>a second!
>

let's see your numbers.  But you can't take anything out.  Otherwise
I have a lot of stuff I'll take out too, particularly in MakeMove()
which
is the second set of numbers above...  And use your pentium pro since it
is
only a touch faster than mine, then we can compare real numbers...



>>2r2rk/1bqnbpp/1p1ppn1p/pP/N1P1P/P2B1N1P/1B2QPP/R2R2K b



This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.