Author: Vincent Diepeveen
Date: 17:54:41 06/06/02
On June 05, 2002 at 22:01:30, Robert Hyatt wrote:

PERHAPS IT IS TIME YOU PROFILED CRAFTY AGAIN. The previous run you did must
have been 10 years ago or so if you still guess it needs 50% of the system
time for its evaluation. The phase where evaluation takes the most system
time is of course the opening, so a long profile run from there is the
'luckiest' case from Crafty's viewpoint. Even there I cannot get it to eat
more than 42% of the system time for *all* the evaluation functions
together. Note that the pawn-structure code eats very little system time
here. Perhaps taking that many MB for the pawn hash table is a bit much? I
took 96MB for the hash table and about 12MB for the pawn table. In short,
in the far endgame it will be more like 20% of the system time going to
evaluation.

Note that this was a parallel compile, but not a parallel run, so in
reality some overhead is wasted on that too, but I see that as a loss
everyone incurs anyway.
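For reference, the instrumentation behind such percentages can be sketched in
a few lines. This is a minimal sketch, not Crafty's or Diep's actual code:
Evaluate(), TimedEvaluate() and Search() below are stand-ins I made up, and
only the timing wrapper is the point. A sampling profiler such as gprof
perturbs the run far less than wrapping every call like this; the sketch just
shows what a reported "X% in evaluation" means.

/* Minimal sketch: measure what fraction of a search's time goes to
   evaluation.  Evaluate() and Search() are trivial stand-ins, NOT any
   engine's real code; the timing wrapper is the point. */
#include <stdio.h>
#include <time.h>

static clock_t eval_ticks;              /* time accumulated inside Evaluate() */

static int Evaluate(void)               /* stand-in static evaluation */
{
    volatile int s = 0;
    for (int i = 0; i < 1000; i++)
        s += i;                         /* fake positional terms */
    return s;
}

static int TimedEvaluate(void)          /* wrap every call the search makes */
{
    clock_t t0 = clock();
    int score = Evaluate();
    eval_ticks += clock() - t0;
    return score;
}

static void Search(long nodes)          /* stand-in for the real search */
{
    for (long i = 0; i < nodes; i++)
        TimedEvaluate();                /* move gen, make/unmake, etc. omitted */
}

int main(void)
{
    clock_t t0 = clock();
    Search(1000000L);
    clock_t total = clock() - t0;
    printf("evaluation: %.1f%% of total time\n",
           100.0 * (double)eval_ticks / (double)total);
    return 0;
}

As Bob notes below, with a real profiler you have to sum Evaluate() and all
its sub-functions to get the true figure, not just the one entry in the flat
profile.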
>On June 05, 2002 at 13:32:52, Vincent Diepeveen wrote:
>
>>On June 05, 2002 at 04:17:12, Bas Hamstra wrote:
>>
>>You forget to mention evaluation. It seems you guys forget
>>that chess is about evaluation of a position. You need
>>so much system time for SEE, Makemove and unmaking
>>moves that it seems you simply have *no time* for evaluation!
>>
>>If the capturing routine/SEE in qsearch is eating all of your system
>>time, then my advice is to NOT use a qsearch, but to use an evaluation
>>that directly estimates what you could possibly lose. Just 1 piece
>>of course. Remove that from the score then.
>>
>>That's a few clocks more and your thing gets a few million nodes a second,
>>but for sure searches 3 ply deeper.
>
>My results here are already well known. Crafty spends nearly 50% of the total
>time in Evaluate() and its sub-functions. SEE, etc. are all very small parts.
>I see Evaluate range from a low of 33% of total search time to a high of just
>over 55%. Note that in the profile code you have to look carefully to get
>all of the individual parts of Evaluate() and not just Evaluate() by itself.
>
>>>On June 04, 2002 at 20:31:41, Robert Hyatt wrote:
>>>
>>>>On June 04, 2002 at 18:01:03, Gian-Carlo Pascutto wrote:
>>>>
>>>>>On June 04, 2002 at 17:52:47, Dann Corbit wrote:
>>>>>
>>>>>>On June 04, 2002 at 16:28:39, Gian-Carlo Pascutto wrote:
>>>>>>
>>>>>>>On June 04, 2002 at 16:18:55, Gian-Carlo Pascutto wrote:
>>>>>>>
>>>>>>>>Because you are using a processor that is clocked at twice the clock
>>>>>>>>frequency? Why compare a 1GHz processor to a (nearly) 2GHz processor
>>>>>>>>and conclude anything about efficiency there? Is there anything that
>>>>>>>>suggests that the Alpha is simply more "efficient"? To justify that
>>>>>>>>clock frequency disparity?
>>>>>>>>
>>>>>>>>A machine twice as fast (clock freq) _should_ perform just as well as
>>>>>>>>a 64-bit machine at 1/2 the frequency... Less would suggest that the
>>>>>>>>32-bit machine simply sucks badly.
>>>>>>>
>>>>>>>I don't agree with the validity of a clock-for-clock comparison,
>>>>>>>but if you want to do it anyway, I'll again point to Vincent's
>>>>>>>numbers:
>>>>>>>
>>>>>>>At the same clock speed, Crafty only gets 33% faster on the 64-bit
>>>>>>>machine.
>>>>>>>
>>>>>>>When you read this, keep in mind that most applications get _more_
>>>>>>>than 33% faster on the 64-bit machine.
>>>>>>
>>>>>>All the new 64-bit chips in the discussion are pretty much at the beta
>>>>>>stage right now.
>>>>>
>>>>>Not true for the Alpha.
>>>>
>>>>Depends on the Alpha being discussed. DEC had processors beyond the 21264
>>>>running, although the 21264 was pretty good. Dann was a bit off on the
>>>>performance, as Tim Mann was running a 21264 at 600MHz and getting right at
>>>>1M nodes per second. McKinley is getting 1.5M at 1000MHz, so the Alpha might
>>>>have a bit of an advantage still, but it is pretty small...
>>>>
>>>>McKinley is only available to a select few. 21264s are fairly common.
>>>>Anything beyond that is not readily available...
>>>>
>>>>>>So, I think that architecturally, it makes good sense to design for a
>>>>>>64-bit system right now.
>>>>>
>>>>>That makes sense, if the 64-bit design is actually faster than the
>>>>>corresponding 32-bit design (even on 64-bit hardware if you wish).
>>>>>
>>>>>The case for bitboards is not clear on that matter. Certainly, if
>>>>>the speedup over non-bitboards is only 33%, they will have a hard time
>>>>>convincingly beating alternative approaches even on 64-bit hardware.
>>>>>
>>>>>--
>>>>>GCP
>>>>
>>>>You are assuming that bitboards are _slower_ than non-bitboard programs on
>>>>32-bit machines. I haven't seen this demonstrated yet. We can always do
>>>>some sort of a test. I.e., since the most common move generator issue is
>>>>"generate all captures", we can try that with bitboard and non-bitboard
>>>>approaches to see if one is really much better than the other on 32-bit
>>>>machines. I don't think so myself. I think they are pretty equal due to
>>>>the multiple-pipe issue.
>>>>
>>>>But a test could be done to see, since this is the most common thing needed
>>>>in a chess engine.
>>>
>>>That's not a fair test, I think. IMO the most heavily used routines are:
>>>
>>>- See()
>>>- GenCaps()
>>>- SquareAttacked()
>>>- Make/Unmake()
>>>
>>>You just pick the one at which bitboards are good. In fact it is nearly
>>>impossible to figure out what's best overall by comparing only parts. What
>>>you could do, though, is generate "profile data" about a search in average
>>>middlegame positions, and see how many times each of the above functions is
>>>being called. Then we could turn this into a sort of benchmark:
>>>
>>>10,000 * a()
>>>8,000 * b()
>>>3,000 * d()
>>>5,000 * c()
>>>
>>>and compare times for bitboards and 0x88 to do this. This would at least
>>>tell us if bitboards are faster *for Crafty*.
>>>
>>>Best regards,
>>>Bas.
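Bas's call-mix benchmark is easy to set up. Below is a minimal sketch under
his assumptions: See(), GenCaps(), SquareAttacked() and MakeUnmake() are
empty dummies standing in for one board representation's real routines, and
the call counts are his illustrative figures, not measured profile data.

/* Sketch of the call-mix benchmark proposed in the thread: run each
   heavily-used routine as often as a real middlegame search would, and
   time the mix.  The four functions are dummies; a real test would link
   the bitboard implementations once and the 0x88 implementations once. */
#include <stdio.h>
#include <time.h>

typedef void (*routine)(void);

static void See(void)            {}   /* static exchange evaluation */
static void GenCaps(void)        {}   /* capture generation         */
static void SquareAttacked(void) {}   /* attack detection           */
static void MakeUnmake(void)     {}   /* make + unmake a move       */

int main(void)
{
    /* Call counts per batch: Bas's illustrative figures.  Real numbers
       would come from profiling a middlegame search.                  */
    struct { const char *name; routine fn; long calls; } mix[] = {
        { "See",            See,            10000 },
        { "GenCaps",        GenCaps,         8000 },
        { "SquareAttacked", SquareAttacked,  3000 },
        { "Make/Unmake",    MakeUnmake,      5000 },
    };
    const long batches = 10000;   /* repeat so the total is measurable */
    double total = 0.0;

    for (int i = 0; i < 4; i++) {
        clock_t t0 = clock();
        for (long n = 0; n < batches * mix[i].calls; n++)
            mix[i].fn();
        double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("%-15s %6ld calls/batch  %.3f s\n",
               mix[i].name, mix[i].calls, secs);
        total += secs;
    }
    printf("whole mix: %.3f s\n", total);
    return 0;
}

Build it once against the bitboard routines and once against the 0x88
routines and compare the totals. The caveat from the thread stands: the mix
ratios are engine-specific, so a Crafty-derived mix only answers the question
for Crafty.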
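Vincent's earlier suggestion in the quoted thread, dropping the quiescence
search and instead subtracting from the static score the one piece the side
to move could lose, can also be sketched. Everything below (the Position
struct, attacked(), the stub evaluation) is a hypothetical illustration, not
Diep's actual code.

/* Sketch of a leaf evaluation without qsearch: estimate the capture
   threat directly by finding the most valuable piece of the side to
   move that stands attacked, and subtract its value from the static
   score.  All types and helpers here are made up for illustration. */
#include <stdio.h>

enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN };
static const int piece_value[] = { 100, 300, 300, 500, 900 };

typedef struct {
    int n_pieces;          /* pieces of the side to move */
    int type[16];          /* their piece types          */
    int sq[16];            /* their squares (a1 = 0)     */
} Position;

static int Evaluate(const Position *p)  /* stub static evaluation */
{
    (void)p;
    return 25;                          /* pretend: +0.25 pawns    */
}

static int attacked(const Position *p, int sq)  /* stub attack test */
{
    (void)p;
    return sq == 27;                    /* pretend the piece on d4 hangs */
}

/* Leaf score without qsearch: static eval minus the most valuable
   piece of the side to move that stands attacked. */
static int LeafEvaluate(const Position *p)
{
    int threat = 0;
    for (int i = 0; i < p->n_pieces; i++)
        if (attacked(p, p->sq[i]) && piece_value[p->type[i]] > threat)
            threat = piece_value[p->type[i]];
    return Evaluate(p) - threat;
}

int main(void)
{
    Position p = { 2, { KNIGHT, ROOK }, { 27, 0 } };
    printf("leaf score: %d\n", LeafEvaluate(&p));   /* 25 - 300 = -275 */
    return 0;
}

A real version would at least check whether the attacked piece is defended (a
cut-down SEE). The sketch only shows the shape of the trade-off Vincent
describes: a cruder leaf score in exchange for a much higher node rate and,
per his claim, a few ply more depth.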