Author: Robert Hyatt
Date: 21:17:38 07/29/03
Go up one level in this thread
On July 30, 2003 at 00:09:58, Keith Evans wrote: >On July 29, 2003 at 23:56:03, Robert Hyatt wrote: > >>On July 29, 2003 at 23:33:57, Keith Evans wrote: >> >>>On July 29, 2003 at 22:44:28, Vincent Diepeveen wrote: >>> >>>>On July 29, 2003 at 21:43:11, Keith Evans wrote: >>>> >>>>>On July 29, 2003 at 20:41:37, Vincent Diepeveen wrote: >>>>> >>>>>>On July 29, 2003 at 18:18:31, Keith Evans wrote: >>>>>> >>>>>>>On July 29, 2003 at 17:35:01, Vincent Diepeveen wrote: >>>>>>> >>>>>>>>On July 29, 2003 at 17:14:52, Keith Evans wrote: >>>>>>>> >>>>>>>>>On July 29, 2003 at 17:04:44, Vincent Diepeveen wrote: >>>>>>>>> >>>>>>>>>>On July 29, 2003 at 16:13:19, Keith Evans wrote: >>>>>>>>>> >>>>>>>>>>>On July 29, 2003 at 16:00:20, Tord Romstad wrote: >>>>>>>>>>> >>>>>>>>>>>>On July 29, 2003 at 12:49:49, Keith Evans wrote: >>>>>>>>>>>> >>>>>>>>>>>>>You're perft performance seems pretty decent to me. >>>>>>>>>>>> >>>>>>>>>>>>Indeed. I just did a similar test with my own program on a Pentium 4 2.4 GHz. >>>>>>>>>>>>In the position after 1. e4 e5 2. d4 d5, my program generates 30 million moves >>>>>>>>>>>>per second. I guess I could speed it up somewhat, but I don't think I would >>>>>>>>>>>>come anywhere close to the speeds reported by Vincent and Angrim. >>>>>>>>>>>> >>>>>>>>>>>>My move genererator assigns all moves a move ordering score, and also >>>>>>>>>>>>determines which moves are checks. It generates legal moves only. >>>>>>>>>>>> >>>>>>>>>>>>But anyway, I don't understand why people spend so much time and energy on >>>>>>>>>>>>micro-optimising their move generators. Despite my slow movegen speed, my >>>>>>>>>>>>program spends only 1 or 2 percent of its time in the move generator. I >>>>>>>>>>>>guess most other programmers have similar numbers. >>>>>>>>>>>> >>>>>>>>>>>>Tord >>>>>>>>>>> >>>>>>>>>>>I'm personally interested in the performance of the move generator in a hardware >>>>>>>>>>>chess chip where it is a large percentage of the total cycles. If it were only >>>>>>>>>>>1-2% of the time then I wouldn't be interested. Of course a hardware move >>>>>>>>>>>generator can generate millions of NPS when only running at say 30 MHz, so it's >>>>>>>>>>>a totally different animal than a software generator running on a 3 GHz >>>>>>>>>>>processor. >>>>>>>>>> >>>>>>>>>>hardware doesn't work like that. you cannot store the moves. >>>>>>>>>> >>>>>>>>> >>>>>>>>>Huh? (Duh?) Where did I say that it pregenerates and stores the moves? Of course >>>>>>>>>it generates them incrementally. >>>>>>>> >>>>>>>>but i hope you realize how hard it is to order moves when all you have is 1 >>>>>>>>bound that gives how far the incremental generation is. >>>>>>>> >>>>>>>>but if you compare speeds. >>>>>>>> >>>>>>>>Say that each move costs 1 clock. that's 30 million moves a second at 30Mhz >>>>>>>>right? >>>>>>>> >>>>>>>>Brutus ran at 2002 WCC at something like 33Mhz. So that's 33 MLN a second. >>>>>>>> >>>>>>>>DIEP i generate way more than 33MLN a second at the 1.6Ghz K7 i had back then. >>>>>>>> >>>>>>>>At 2.127Ghz it is about 72MLN. this with slow RAM storage. It's probably >>>>>>>>relatively faster at a P4 generating moves because of the fast L1 cache there >>>>>>>>and everything runs within trace cache when doing a loop for a few millions of >>>>>>>>times. >>>>>>>> >>>>>>> >>>>>>>Can you do perft at 72 million NPS? (Actually traverse a tree?) If not then >>>>>>>you're quoting something different. You could use Chrilly's 7 cycle/node number >>>>>>>which should include everything to generate, make, and unmake moves. So at 33 >>>>>>>MHz that would be 4.7 MNPS. >>>>>> >>>>>>please do not compare perft with generating moves. >>>>>> >>>>>> >>>>>>perft is generating a NUMBER. not moves. >>>>>> >>>>>>do you understand? >>>>>> >>>>>>all you need is a number for perft. not moves. >>>>> >>>>>If you time perft then you get a node count and a time in seconds, therefore you >>>>>can get nodes per second out of it. If you want to measure correctness then you >>>>>can just compare the node count. If you want some measure of performance then >>>>>you obviously look at NPS. >>>>> >>>>>I think that this is obvious. Just search through the CCC archives and look for >>>>>people trying to optimize the NPS in perft. Not that I'm recommending optimizing >>>>>for that, but it's been discussed many times. >>>>> >>>>>My opinion is that if you're going to quote 72 million NPS in software and try >>>>>to compare that to a hardware implementation, then the easiest way is to compare >>>>>using a perft style test. If you're just setting up one position and generating >>>>>moves from that position repeatedly in a tight loop, then there's really no way >>>>>to compare that. Plus my belief is that the perft number is more representative >>>>>of the performance of the move generator under real conditions. >>>>> >>>>>Here's some perft output from crafty which I take to be the de facto perft >>>>>standard: >>>>> >>>>>Crafty v19.3 >>>>> >>>>>White(1): perft 5 >>>>>total moves=4865609 time=1.24 >>>>> >>>>>So to convert this to NPS 4,865,609 [nodes]/1.24 [seconds] = 3,923,878 >>>>>[nodes/second] >>>>> >>>>>Please explain why that is incorrect? What does diep get for this case? >>>>>73,000,000 NPS? I find that hard to believe. >>>> >>>>you are making a mistake. >>>> >>>>you still do not understand what perft is. perft is a number. you can get it >>>>faster by adding hashtables and such. it has nothing to do with generating semi >>>>legal or legal moves at all. it is just creating a number faster. and it's not >>>>only hashtables. ever thought of how you can speed up perft incredible by using >>>>an incremental attacktable? >>>> >>>>There is no need to generate *any* legal move at all for perft. >>>> >>>>If i use a fast 8 bits attacktable for it at a K7 then even without hashtable >>>>you can get already like tens of millions of nodes a second easily. >>>> >>>>that number divided by the time is your perft nps number. >>>> >>>>it has nothing to do with GENERATION. you just must smartly count how many there >>>>are *period*. >>>> >>>>generating a few millions of times the same move list in a position X is quite >>>>different from that. >>>> >>>>So the rest of the story written here i will not even read. Sorry. Bedtime here >>>>:) >>> >>>If you don't want to answer the question, then just say so. No need to dance >>>around it... >>> >>>I would like to see a perft of yours that doesn't generate any moves, and which >>>can work with any arbitrary position to an arbitrary depth. If you want to use >>>attacktables then that's fine, you will have to maintain those tables and that >>>will take time. I personally think that using hashtables is going against the >>>spirit of the test, but there are always people that use "creative" techniques >>>to appear faster at benchmarks. >>> >>>If we can't agree on something so basic, then you're right that it's pointless >>>to continue the discussion. >>> >>>-K >> >> >>You finally reached the ultimate conclusion. :) >> >>But to re-iterate another point, perft is a correctness benchmark _only_, as it >>is defined. It is _not_ a performance benchmark. If it were, most of us would >>write it differently, certainly using hashing to avoid traversing duplicate >>parts of the tree searched due to transpositions. >> > >You're the one that prints out how many seconds that it takes ;-) Yes, but for a _different_ reason. If someone wants a perft 8 for some position, and a perft 6 takes 400 seconds, I can take a wild guess that I am not going to wait for 8. :) > >I don't know of anything else that could potentially be used as a move generator >benchmark. Not that one is desperately needed or anything, but I was hoping to >compare say Chrilly's hardware performance with some software implementations. >If you play by the same unwritten rules then it seems like perft is the closest >thing that we have to a benchmark that can measure effective legal move >generation rates in the context of making and unmaking moves. It would not be hard to write one. Let's pick a set of positions, alternating with black and white on move. Generate the moves for the first, then the second, etc. Do this N times. Time how long it takes. That is all move generator, without letting the board and everything just cycle through cache over and over with nearly perfect branch prediction, etc. > >This is potentially an interesting figure because for a hardware based engine >you shouldn't need to add too many additional cycles per node for evaluation if >you parallelize the hell out of the evaluation. So the move generator probably >takes 50% of the time. Especially if it's generating pseudo-legal moves and >wasting some cycles like the Belle generator. (Not that this is a bad thing, >just something that needs to be remembered when counting up the cycles per >node.) I agree. For software it is totally irrelevant. For hardware, things change. > >-K
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.