Computer Chess Club Archives


Search

Terms

Messages

Subject: Re: movegen speeds(was Re: Status of Brutus?)

Author: Robert Hyatt

Date: 20:56:03 07/29/03

Go up one level in this thread


On July 29, 2003 at 23:33:57, Keith Evans wrote:

>On July 29, 2003 at 22:44:28, Vincent Diepeveen wrote:
>
>>On July 29, 2003 at 21:43:11, Keith Evans wrote:
>>
>>>On July 29, 2003 at 20:41:37, Vincent Diepeveen wrote:
>>>
>>>>On July 29, 2003 at 18:18:31, Keith Evans wrote:
>>>>
>>>>>On July 29, 2003 at 17:35:01, Vincent Diepeveen wrote:
>>>>>
>>>>>>On July 29, 2003 at 17:14:52, Keith Evans wrote:
>>>>>>
>>>>>>>On July 29, 2003 at 17:04:44, Vincent Diepeveen wrote:
>>>>>>>
>>>>>>>>On July 29, 2003 at 16:13:19, Keith Evans wrote:
>>>>>>>>
>>>>>>>>>On July 29, 2003 at 16:00:20, Tord Romstad wrote:
>>>>>>>>>
>>>>>>>>>>On July 29, 2003 at 12:49:49, Keith Evans wrote:
>>>>>>>>>>
>>>>>>>>>>>You're perft performance seems pretty decent to me.
>>>>>>>>>>
>>>>>>>>>>Indeed.  I just did a similar test with my own program on a Pentium 4 2.4 GHz.
>>>>>>>>>>In the position after 1. e4 e5 2. d4 d5, my program generates 30 million moves
>>>>>>>>>>per second.  I guess I could speed it up somewhat, but I don't think I would
>>>>>>>>>>come anywhere close to the speeds reported by Vincent and Angrim.
>>>>>>>>>>
>>>>>>>>>>My move genererator assigns all moves a move ordering score, and also
>>>>>>>>>>determines which moves are checks.  It generates legal moves only.
>>>>>>>>>>
>>>>>>>>>>But anyway, I don't understand why people spend so much time and energy on
>>>>>>>>>>micro-optimising their move generators.  Despite my slow movegen speed, my
>>>>>>>>>>program spends only 1 or 2 percent of its time in the move generator.  I
>>>>>>>>>>guess most other programmers have similar numbers.
>>>>>>>>>>
>>>>>>>>>>Tord
>>>>>>>>>
>>>>>>>>>I'm personally interested in the performance of the move generator in a hardware
>>>>>>>>>chess chip where it is a large percentage of the total cycles. If it were only
>>>>>>>>>1-2% of the time then I wouldn't be interested. Of course a hardware move
>>>>>>>>>generator can generate millions of NPS when only running at say 30 MHz, so it's
>>>>>>>>>a totally different animal than a software generator running on a 3 GHz
>>>>>>>>>processor.
>>>>>>>>
>>>>>>>>hardware doesn't work like that. you cannot store the moves.
>>>>>>>>
>>>>>>>
>>>>>>>Huh? (Duh?) Where did I say that it pregenerates and stores the moves? Of course
>>>>>>>it generates them incrementally.
>>>>>>
>>>>>>but i hope you realize how hard it is to order moves when all you have is 1
>>>>>>bound that gives how far the incremental generation is.
>>>>>>
>>>>>>but if you compare speeds.
>>>>>>
>>>>>>Say that each move costs 1 clock. that's 30 million moves a second at 30Mhz
>>>>>>right?
>>>>>>
>>>>>>Brutus ran at 2002 WCC at something like 33Mhz. So that's 33 MLN a second.
>>>>>>
>>>>>>DIEP i generate way more than 33MLN a second at the 1.6Ghz K7 i had back then.
>>>>>>
>>>>>>At 2.127Ghz it is about 72MLN. this with slow RAM storage. It's probably
>>>>>>relatively faster at a P4 generating moves because of the fast L1 cache there
>>>>>>and everything runs within trace cache when doing a loop for a few millions of
>>>>>>times.
>>>>>>
>>>>>
>>>>>Can you do perft at 72 million NPS? (Actually traverse a tree?) If not then
>>>>>you're quoting something different. You could use Chrilly's 7 cycle/node number
>>>>>which should include everything to generate, make, and unmake moves. So at 33
>>>>>MHz that would be 4.7 MNPS.
>>>>
>>>>please do not compare perft with generating moves.
>>>>
>>>>
>>>>perft is generating a NUMBER. not moves.
>>>>
>>>>do you understand?
>>>>
>>>>all you need is a number for perft. not moves.
>>>
>>>If you time perft then you get a node count and a time in seconds, therefore you
>>>can get nodes per second out of it. If you want to measure correctness then you
>>>can just compare the node count. If you want some measure of performance then
>>>you obviously look at NPS.
>>>
>>>I think that this is obvious. Just search through the CCC archives and look for
>>>people trying to optimize the NPS in perft. Not that I'm recommending optimizing
>>>for that, but it's been discussed many times.
>>>
>>>My opinion is that if you're going to quote 72 million NPS in software and try
>>>to compare that to a hardware implementation, then the easiest way is to compare
>>>using a perft style test. If you're just setting up one position and generating
>>>moves from that position repeatedly in a tight loop, then there's really no way
>>>to compare that. Plus my belief is that the perft number is more representative
>>>of the performance of the move generator under real conditions.
>>>
>>>Here's some perft output from crafty which I take to be the de facto perft
>>>standard:
>>>
>>>Crafty v19.3
>>>
>>>White(1): perft 5
>>>total moves=4865609  time=1.24
>>>
>>>So to convert this to NPS 4,865,609 [nodes]/1.24 [seconds] = 3,923,878
>>>[nodes/second]
>>>
>>>Please explain why that is incorrect? What does diep get for this case?
>>>73,000,000 NPS? I find that hard to believe.
>>
>>you are making a mistake.
>>
>>you still do not understand what perft is. perft is a number. you can get it
>>faster by adding hashtables and such. it has nothing to do with generating semi
>>legal or legal moves at all. it is just creating a number faster. and it's not
>>only hashtables. ever thought of how you can speed up perft incredible by using
>>an incremental attacktable?
>>
>>There is no need to generate *any* legal move at all for perft.
>>
>>If i use a fast 8 bits attacktable for it at a K7 then even without hashtable
>>you can get already like tens of millions of nodes a second easily.
>>
>>that number divided by the time is your perft nps number.
>>
>>it has nothing to do with GENERATION. you just must smartly count how many there
>>are *period*.
>>
>>generating a few millions of times the same move list in a position X is quite
>>different from that.
>>
>>So the rest of the story written here i will not even read. Sorry. Bedtime here
>>:)
>
>If you don't want to answer the question, then just say so. No need to dance
>around it...
>
>I would like to see a perft of yours that doesn't generate any moves, and which
>can work with any arbitrary position to an arbitrary depth. If you want to use
>attacktables then that's fine, you will have to maintain those tables and that
>will take time. I personally think that using hashtables is going against the
>spirit of the test, but there are always people that use "creative" techniques
>to appear faster at benchmarks.
>
>If we can't agree on something so basic, then you're right that it's pointless
>to continue the discussion.
>
>-K


You finally reached the ultimate conclusion.  :)

But to re-iterate another point, perft is a correctness benchmark _only_, as it
is defined.  It is _not_ a performance benchmark.  If it were, most of us would
write it differently, certainly using hashing to avoid traversing duplicate
parts of the tree searched due to transpositions.

But we only care about the _size_ of the tree searched, not the speed it is
searched at...

Hence the definition of how to compute the size.




This page took 0.01 seconds to execute

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.