Author: Robert Hyatt
Date: 08:33:50 06/08/98
Go up one level in this thread
On June 08, 1998 at 11:18:54, Vincent Diepeveen wrote: >On June 08, 1998 at 11:01:11, Robert Hyatt wrote: > >>On June 08, 1998 at 10:34:23, Vincent Diepeveen wrote: >> >>>On June 08, 1998 at 07:37:44, Robert Hyatt wrote: >>> >>>>On June 08, 1998 at 07:17:08, Vincent Diepeveen wrote: >>>> >>>>> >>>>>On June 05, 1998 at 21:48:57, Robert Hyatt wrote: >>>>> >>>>>>On June 05, 1998 at 17:47:07, Bruce Moreland wrote: >>>>>> >>>>>>> >>>>>>>On June 05, 1998 at 08:53:53, Ernst A. Heinz wrote: >>>>>>> >>>>>>>>BTW, we have been waiting to get our hands on a Alpha-21264 for roughly >>>>>>>>a year because "DarkThought" will hopefully speed up as much on them >>>>>>>>as "Crafty". >>>>>>> >>>>>>>What percentage of execution time do you and Bob spend doing these >>>>>>>operations, do you think?netscape -install >>> >>>>>>> >>>>>>>bruce >>>>>> >>>>>>FindFirstOne() = 1-2% in crafty. popcnt is much less... so that won't >>>>>>help a whole lot. but the 64 bit stuff helps in general... >>>>> >>>>>Well that doesn't say anything concrete. >>>>> >>>>>More important is the overall delay. >>>>> >>>>>At a P133 intel the speed difference between crafty's move generator >>>>>and diep's move generator is somewhat more than 3 times. >>>>> >>>>>Crafty is more than 3 times slower, because of bitboards, which perform >>>>>badly at 32 bits. Note that the more possibilities, this 3 times will >>>>>become >>>>>4 times. >>>>> >>>>>For something taking 1 clockcycle in my program and which can >>>>>be even pipelined in U + V pipe, >>>>>crafty needs sometimes even a whole function. >>>>> >>>>>Greetings, >>>>>Vincent >>>> >>>> >>>>yes... but try to generate nothing but captures and compare that, which >>>>is a major part of the total tree search... >>> >>>Where do i need to generate captures for? >>> >>>But to answer your question: 2.5 times faster. >> >> >>Because in ordering alpha/beta moves, captures are searched first. And > >this statement is already naive. > >I search a move that gives a cutoff. Sometimes a capture, sometimes a >check, >sometimes a promotion, depending on what move is gonna give me the >cutoff. > >I think we directly here see why Diep's branching factor is that well. >I don't have a special order. I pick simply the move that is very >probably gonna give me a cutoff. > >>they cause most of the cutoffs. If you just produce captures, you don't >>have to continually wade through the non-captures picking out capture >>moves to try... >> >>and the best test to try is to set up a position in Crafty, and use the >>"perf" command... set up a position such as kopec22 which is a >>reasonable >>early middlegame position, and type "perf". On a single processor >>pentium >>pro, that produces the following: > >We cannot compare our processors. I have a P133 at the same speed, >but i don't have a Pentium pro SDRAM with 512 kb cache at a fast >mainboard, >like you have. > I don't have SDRAM nor 512KB cache. My ALR uses FPM memory and each cpu has 256K of cache... just like yours probably... except your memory is probably a bit faster (EDO) vs my fast page mode memory. But my Orion chipset for 4 processors only supports FPM for interleaving... >I have a P133 laptop with a processor which is exactly 133Mhz, >and i have a pro 200Mhz 256kb cache with edoram, which is way slower >than >your pro. Wrong, as I said... Yours is actually a little faster, as my old Gateway with EDO ram was a couple of percent faster than my current machine... > My datastructure doesn't fit within 256kb cache, except when >i'm >just generating moves... ...so that'll be a fair compare except for the >SDRAM. >I'm not sure what influence it has on just generating moves over and >over again. > >>Black(1): perf >>generated 3300000 moves, time=1.38 seconds >>generated 2391304 moves per second >>generated/made/unmade 3300000 moves, time=4.94 seconds >>generated/made/unmade 668016 moves per second > >We must not forget that if i generate moves (3 times faster), that >i store an entire structure for every move, where you only do > >*move++ = int; > >We should take this into account. no reason... You said "you can generate moves 3x faster". I gave you a number of moves I can generate per second. If you can't generate them that fast, then you aren't 3x faster. I do things you don't do... you do things I don't do. But we both generate moves... > >I'm storing a lot of different stuff with a move which gets used in the >move ordering and evaluation. > >I'm now here under UNIX at a slow HP9000 model 712 (60Mhz). >When i'm at home i do this test special for you, where i shall replace >that >storing of datastructure. > >>IE I can generate all moves from that position for white, at 2.4M moves >>per second, or I can generate all the moves, and then make each one, and >>do that at 670K moves per second... > >>What's your speed? here is the position, diagram and FEN: > >We should compare branching factor, that would be nice. >You may use your 512MB SDRAM 4 processor against my pro200, >also nice would be to do this generating of moves at a K6 or P133, which >clearly will show that factor 3 to 4. you keep saying that... but you are wrong. Again, no SDRAM, which wouldn't matter anyway until someone gets a BX motherboard that can use the faster SDRAM cycle time for the second through fourth words of memory. And I don't have 512K of cache... they were too expensive when I bought this machine. > >>Black(1): d >> >> +---+---+---+---+---+---+---+---+ >> 8 | | | *R| | | *R| *K| | >> +---+---+---+---+---+---+---+---+ >> 7 | | *B| *Q| *N| *B| *P| *P| | >> +---+---+---+---+---+---+---+---+ >> 6 | | *P| | *P| *P| *N| | *P| >> +---+---+---+---+---+---+---+---+ >> 5 | *P| P | | | | | | | >> +---+---+---+---+---+---+---+---+ >> 4 | N | | P | | P | | | | >> +---+---+---+---+---+---+---+---+ >> 3 | P | | | B | | N | | P | >> +---+---+---+---+---+---+---+---+ >> 2 | | B | | | Q | P | P | | >> +---+---+---+---+---+---+---+---+ >> 1 | R | | | R | | | K | | >> +---+---+---+---+---+---+---+---+ >> a b c d e f g h >> > >Cool this has a lot of possibilities. That'll go with many many million >moves >a second! > let's see your numbers. But you can't take anything out. Otherwise I have a lot of stuff I'll take out too, particularly in MakeMove() which is the second set of numbers above... And use your pentium pro since it is only a touch faster than mine, then we can compare real numbers... >>2r2rk/1bqnbpp/1p1ppn1p/pP/N1P1P/P2B1N1P/1B2QPP/R2R2K b
This page took 0.01 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.