Author: Andrew Dados
Date: 13:46:31 04/25/00
Go up one level in this thread
On April 25, 2000 at 16:27:46, Dan Newman wrote: >On April 25, 2000 at 14:57:28, Andrew Dados wrote: > >>On April 25, 2000 at 07:24:38, Dan Newman wrote: >> >>>On April 25, 2000 at 06:50:18, Andrew Williams wrote: >>> >>>>On April 25, 2000 at 05:59:45, Bas Hamstra wrote: >>>> >>>>>On April 25, 2000 at 01:33:45, Tom Kerrigan wrote: >>>>> >>>>>>On April 25, 2000 at 01:32:48, Tom Kerrigan wrote: >>>>>> >>>>>>>On April 25, 2000 at 01:20:11, Bas Hamstra wrote: >>>>>>> >>>>>>>>(Currently I myself have started all over trying to make a really fast rotated >>>>>>>>BB program. Looks promising so far: make/unmake=2M/sec and movegeneration=12M >>>>>>>>moves/sec on a Celeron 466) >>>>>>> >>>>>>>My program does gen/make/unmake at 2M/sec, and just gen at ~10M/sec. >>>>>>> >>>>>>>I'm not using any sort of bitboards. >>>>>>> >>>>>>>Maybe something to consider... >>>>>>> >>>>>>>-Tom >>>>>> >>>>>>Whoops, forgot to mention that I have a Celeron/400. >>>>>>-Tom >>>>> >>>>>I am not very familiar with these type of statistics. What exactly does that >>>>>"gen/make/unmake" mean? Gen all moves and then making/unmaking each? But what >>>>>does it SAY..? I am afraid not much for my type of program. Here is why. >>>>> >>>>>- I have distinct functions gencaps() and gennoncaps() >>>>>- Profiler says caps() is called 10x as much called as noncaps() >>>>>- So 90% the time all the work of gen() is only to find a capture that gives a >>>>>cutoff. Regardless if you distinct them or not. >>>>> >>>>>So you can maybe extremely fast gen+make+unmake all moves. But what *I* want to >>>>>know is "how many times/sec" can I do a capgen(), since that has to be done >>>>>practically *every* node. >>>>> >>>>>And since it don't seem to matter a lot timewise if I generate 3 or 5 captures, >>>>>I simply measure the above TotalCapGens/sec. Maybe we can compare (same hardware >>>>>anyway): >>>>> >>>>>after e4 e5 Nf3 Nf6 Nc3 Nc6 Bc4 Bc5 >>>>> >>>>>I can do a CaptureGen() 800k times a second. I am interested in what others >>>>>have, especially 0x88 and non-rotated BB programs. >>>>> >>>>> >>>>>Regards, >>>>>Bas Hamstra. >>>> >>>>Hi >>>> >>>>The fen string for the position you describe is: >>>> >>>>r1bqk2r/pppp1ppp/2n2n2/2b1p3/2B1P3/2N2N2/PPPP1PPP/R1BQK2R w KQkq - 0 5 >>>> >>>>PostModernist is a 0x88 program, but it keeps up-to-date 32-bit attack >>>>records, which it uses for capture generation (amongst other things). PM >>>>can call generate_captures 510k times per second on a PIII-450 in this >>>>position. I'd be very interested in other programs' performances. >>>> >>>>Andrew >>> >>>My 0x88 program gets 830k capture generations/s (10M iterations) on a >>>Cel-400, but I'm not doing anything fancy, like attack records... >>> >>>-Dan. >> >>Speed or raw 'capture generation' is not necessarily important in 0x88. >>My program does about 160k 'full capture' generations/second, generating in near >>MVV/LMV order, but one victim at the time, starting either with 'killer capture' >>or last moved piece. >> > > >I wrote an MVV/LVA order capture generator for the 0x88 program, but it >suffered from the N-squared problem: when there were 32-pieces on the board, >it had to examine 16x16=256 different capture posibilities, but it was much >faster than the normal capture generator when there were only a very few >pieces on the board. (It used the coordinate-difference trick to quickly >skip victim-attacker pairs for which capture was impossible.) 10x15 *max* capture possibilities (when no promotions): (8 attacking pieces + 2 possible pawn captures) x 15 victims....which is 150 coordinate-difference lookups. With move-like generation of all captures you get somewhere around 70-100 tests (including off-the-board and color-of-piece tests) for all piece set. > >But, if instead you emit a capture at a time, and try it before emitting the >next, then it seems like it could be a win... > > >>However for about 1/2 nodes (FH nodes) it never gets past first one or two >>MVV... also in qsearch it skips victims which have less value then needed to >>meet a bound, thus effectively making some 500-600k 'neccesary to cutoff' >>generations/second. >> >>Generating all captures at once then sorting the list may be much slower, or >>little faster, depending on number of pieces on the board (according to my >>tests, that is). >> >>All above numbers for AMD k3-450. >> >>-Andrew- > > >Hmm. This is very tricky. If you do a SEE, you get slightly better move >ordering than you get with MVV/LVA (not a awfull lot in my tests but >enough to make it slightly worth the extra cost in a moderately deep >search). OTOH, if this "just-in-time" capture generation speeds things >up enough, it could reach break even, or even exceed it (in, say, tournament >time controls). But, at some point, if you go deep enough, better move >ordering should win out even if it costs a whole lot more (per node). >That may not matter however if it takes an hour to get that deep... I still do some limited SEE in mine, and also try to reuse victim which gave me cutoff info (I clear it 2 plys below). Sorta 'killer capture'... can be even better then SEE stuff at times. -Andrew- > >-Dan.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.