Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: movegen speeds(was Re: Status of Brutus?)

Author: Robert Hyatt

Date: 21:17:38 07/29/03

Go up one level in this thread


On July 30, 2003 at 00:09:58, Keith Evans wrote:

>On July 29, 2003 at 23:56:03, Robert Hyatt wrote:
>
>>On July 29, 2003 at 23:33:57, Keith Evans wrote:
>>
>>>On July 29, 2003 at 22:44:28, Vincent Diepeveen wrote:
>>>
>>>>On July 29, 2003 at 21:43:11, Keith Evans wrote:
>>>>
>>>>>On July 29, 2003 at 20:41:37, Vincent Diepeveen wrote:
>>>>>
>>>>>>On July 29, 2003 at 18:18:31, Keith Evans wrote:
>>>>>>
>>>>>>>On July 29, 2003 at 17:35:01, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>>>On July 29, 2003 at 17:14:52, Keith Evans wrote:
>>>>>>>>
>>>>>>>>>On July 29, 2003 at 17:04:44, Vincent Diepeveen wrote:
>>>>>>>>>
>>>>>>>>>>On July 29, 2003 at 16:13:19, Keith Evans wrote:
>>>>>>>>>>
>>>>>>>>>>>On July 29, 2003 at 16:00:20, Tord Romstad wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On July 29, 2003 at 12:49:49, Keith Evans wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>You're perft performance seems pretty decent to me.
>>>>>>>>>>>>
>>>>>>>>>>>>Indeed.  I just did a similar test with my own program on a Pentium 4 2.4 GHz.
>>>>>>>>>>>>In the position after 1. e4 e5 2. d4 d5, my program generates 30 million moves
>>>>>>>>>>>>per second.  I guess I could speed it up somewhat, but I don't think I would
>>>>>>>>>>>>come anywhere close to the speeds reported by Vincent and Angrim.
>>>>>>>>>>>>
>>>>>>>>>>>>My move genererator assigns all moves a move ordering score, and also
>>>>>>>>>>>>determines which moves are checks.  It generates legal moves only.
>>>>>>>>>>>>
>>>>>>>>>>>>But anyway, I don't understand why people spend so much time and energy on
>>>>>>>>>>>>micro-optimising their move generators.  Despite my slow movegen speed, my
>>>>>>>>>>>>program spends only 1 or 2 percent of its time in the move generator.  I
>>>>>>>>>>>>guess most other programmers have similar numbers.
>>>>>>>>>>>>
>>>>>>>>>>>>Tord
>>>>>>>>>>>
>>>>>>>>>>>I'm personally interested in the performance of the move generator in a hardware
>>>>>>>>>>>chess chip where it is a large percentage of the total cycles. If it were only
>>>>>>>>>>>1-2% of the time then I wouldn't be interested. Of course a hardware move
>>>>>>>>>>>generator can generate millions of NPS when only running at say 30 MHz, so it's
>>>>>>>>>>>a totally different animal than a software generator running on a 3 GHz
>>>>>>>>>>>processor.
>>>>>>>>>>
>>>>>>>>>>hardware doesn't work like that. you cannot store the moves.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Huh? (Duh?) Where did I say that it pregenerates and stores the moves? Of course
>>>>>>>>>it generates them incrementally.
>>>>>>>>
>>>>>>>>but i hope you realize how hard it is to order moves when all you have is 1
>>>>>>>>bound that gives how far the incremental generation is.
>>>>>>>>
>>>>>>>>but if you compare speeds.
>>>>>>>>
>>>>>>>>Say that each move costs 1 clock. that's 30 million moves a second at 30Mhz
>>>>>>>>right?
>>>>>>>>
>>>>>>>>Brutus ran at 2002 WCC at something like 33Mhz. So that's 33 MLN a second.
>>>>>>>>
>>>>>>>>DIEP i generate way more than 33MLN a second at the 1.6Ghz K7 i had back then.
>>>>>>>>
>>>>>>>>At 2.127Ghz it is about 72MLN. this with slow RAM storage. It's probably
>>>>>>>>relatively faster at a P4 generating moves because of the fast L1 cache there
>>>>>>>>and everything runs within trace cache when doing a loop for a few millions of
>>>>>>>>times.
>>>>>>>>
>>>>>>>
>>>>>>>Can you do perft at 72 million NPS? (Actually traverse a tree?) If not then
>>>>>>>you're quoting something different. You could use Chrilly's 7 cycle/node number
>>>>>>>which should include everything to generate, make, and unmake moves. So at 33
>>>>>>>MHz that would be 4.7 MNPS.
>>>>>>
>>>>>>please do not compare perft with generating moves.
>>>>>>
>>>>>>
>>>>>>perft is generating a NUMBER. not moves.
>>>>>>
>>>>>>do you understand?
>>>>>>
>>>>>>all you need is a number for perft. not moves.
>>>>>
>>>>>If you time perft then you get a node count and a time in seconds, therefore you
>>>>>can get nodes per second out of it. If you want to measure correctness then you
>>>>>can just compare the node count. If you want some measure of performance then
>>>>>you obviously look at NPS.
>>>>>
>>>>>I think that this is obvious. Just search through the CCC archives and look for
>>>>>people trying to optimize the NPS in perft. Not that I'm recommending optimizing
>>>>>for that, but it's been discussed many times.
>>>>>
>>>>>My opinion is that if you're going to quote 72 million NPS in software and try
>>>>>to compare that to a hardware implementation, then the easiest way is to compare
>>>>>using a perft style test. If you're just setting up one position and generating
>>>>>moves from that position repeatedly in a tight loop, then there's really no way
>>>>>to compare that. Plus my belief is that the perft number is more representative
>>>>>of the performance of the move generator under real conditions.
>>>>>
>>>>>Here's some perft output from crafty which I take to be the de facto perft
>>>>>standard:
>>>>>
>>>>>Crafty v19.3
>>>>>
>>>>>White(1): perft 5
>>>>>total moves=4865609  time=1.24
>>>>>
>>>>>So to convert this to NPS 4,865,609 [nodes]/1.24 [seconds] = 3,923,878
>>>>>[nodes/second]
>>>>>
>>>>>Please explain why that is incorrect? What does diep get for this case?
>>>>>73,000,000 NPS? I find that hard to believe.
>>>>
>>>>you are making a mistake.
>>>>
>>>>you still do not understand what perft is. perft is a number. you can get it
>>>>faster by adding hashtables and such. it has nothing to do with generating semi
>>>>legal or legal moves at all. it is just creating a number faster. and it's not
>>>>only hashtables. ever thought of how you can speed up perft incredible by using
>>>>an incremental attacktable?
>>>>
>>>>There is no need to generate *any* legal move at all for perft.
>>>>
>>>>If i use a fast 8 bits attacktable for it at a K7 then even without hashtable
>>>>you can get already like tens of millions of nodes a second easily.
>>>>
>>>>that number divided by the time is your perft nps number.
>>>>
>>>>it has nothing to do with GENERATION. you just must smartly count how many there
>>>>are *period*.
>>>>
>>>>generating a few millions of times the same move list in a position X is quite
>>>>different from that.
>>>>
>>>>So the rest of the story written here i will not even read. Sorry. Bedtime here
>>>>:)
>>>
>>>If you don't want to answer the question, then just say so. No need to dance
>>>around it...
>>>
>>>I would like to see a perft of yours that doesn't generate any moves, and which
>>>can work with any arbitrary position to an arbitrary depth. If you want to use
>>>attacktables then that's fine, you will have to maintain those tables and that
>>>will take time. I personally think that using hashtables is going against the
>>>spirit of the test, but there are always people that use "creative" techniques
>>>to appear faster at benchmarks.
>>>
>>>If we can't agree on something so basic, then you're right that it's pointless
>>>to continue the discussion.
>>>
>>>-K
>>
>>
>>You finally reached the ultimate conclusion.  :)
>>
>>But to re-iterate another point, perft is a correctness benchmark _only_, as it
>>is defined.  It is _not_ a performance benchmark.  If it were, most of us would
>>write it differently, certainly using hashing to avoid traversing duplicate
>>parts of the tree searched due to transpositions.
>>
>
>You're the one that prints out how many seconds that it takes ;-)

Yes, but for a _different_ reason.

If someone wants a perft 8 for some position, and a perft 6 takes 400 seconds,
I can take a wild guess that I am not going to wait for 8.  :)


>
>I don't know of anything else that could potentially be used as a move generator
>benchmark. Not that one is desperately needed or anything, but I was hoping to
>compare say Chrilly's hardware performance with some software implementations.
>If you play by the same unwritten rules then it seems like perft is the closest
>thing that we have to a benchmark that can measure effective legal move
>generation rates in the context of making and unmaking moves.

It would not be hard to write one.  Let's pick a set of positions, alternating
with black and white on move.  Generate the moves for the first, then the
second, etc.  Do this N times.  Time how long it takes.  That is all move
generator, without letting the board and everything just cycle through cache
over and over with nearly perfect branch prediction, etc.






>
>This is potentially an interesting figure because for a hardware based engine
>you shouldn't need to add too many additional cycles per node for evaluation if
>you parallelize the hell out of the evaluation. So the move generator probably
>takes 50% of the time. Especially if it's generating pseudo-legal moves and
>wasting some cycles like the Belle generator. (Not that this is a bad thing,
>just something that needs to be remembered when counting up the cycles per
>node.)

I agree.  For software it is totally irrelevant.  For hardware, things
change.


>
>-K



This page took 0 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.