Author: Robert Hyatt
Date: 14:09:33 06/21/02
On June 20, 2002 at 13:29:36, Keith Evans wrote:
>On June 20, 2002 at 13:03:10, Robert Hyatt wrote:
>
>>On June 20, 2002 at 12:30:47, Keith Evans wrote:
>>
>>>On June 19, 2002 at 23:27:51, Robert Hyatt wrote:
>>>
>>>>On June 19, 2002 at 20:45:33, Keith Evans wrote:
>>>>
>>>>>On June 19, 2002 at 14:33:56, Tom Kerrigan wrote:
>>>>>
>>>>>>On June 19, 2002 at 13:10:42, Robert Hyatt wrote:
>>>>>>
>>>>>>>I don't care about the 32 bit specint. I care about the fact that a
>>>>>>>1.4ghz pentium runs Crafty at about 750K nodes per second. The 600mhz
>>>>>>>21264 ran it at over 800K. And 600mhz is _not_ the fastest 21264 around.
>>>>>>>
>>>>>>>The 1ghz mckinley runs it twice as fast as that 1.4ghz pentium, 1.5M nodes
>>>>>>>per second. _that_ is definitely "something to get excited about" IMHO..
>>>>>>
>>>>>>So you like the 21264 and the McKinley. That's great. Maybe you can start a fan
>>>>>>club, instead of posting to a thread where people are trying to have an
>>>>>>intelligent conversation about 64-bit computing.
>>>>>>
>>>>>>-Tom
>>>>>
>>>>>Is there an easy way to compare a 1.4 GHz P3 to a 1 GHz McKinley and see where
>>>>>this Crafty performance increase is coming from? I'm not at all familiar with
>>>>>McKinley, but would it be possible to run a version of Crafty compiled for
>>>>>32-bits on a McKinley and compare that to a Crafty compiled for 64-bits on
>>>>>McKinley? Is this a dumb idea? If this isn't possible, then it's going to be
>>>>>difficult to tell where the performance gain is really coming from.
>>>>>
>>>>>-Keith
>>>>
>>>>I don't know that you could do this. It would require that the compiler know
>>>>how to implement 64 bit ints as 2x32 bits, which on a mckinley would be a waste
>>>>of the compiler-writer's time...
>>>
>>>So what's wrong with Tom's suggestion - "You can make a bitboard class that
>>>contains two 32-bit ints and overload all the int operators and run it on a
>>>64-bit chip. ...
>>>renaming his source files from .c to .cpp and writing this
>>>simple class"
>>>
>>>Is this is a valid experiment then why not do it and settle the argument?
>>>Someone might even consider publishing a paper on it. (Maybe Tom would volunteer
>>>to hack the code if you don't want to bother?) If it's not valid then what's
>>>wrong with it? Do you think that the compiler would outsmart Tom and use 64-bit
>>>words for the bitboards anyways?
>>>
>>>What would your prediction for such an experiment be? That the version with 2 x
>>>32-bit bitboards would run half as fast as the version with 64-bit bitboards?
>>>
>>>I'm pretty sure that we could find some willing volunteers to run some simple
>>>experiments on their hardware.
>>>
>>>-Keith
>>
>>
>>It could certainly be done. However, I don't see what it would prove.
>>Other than that 64 bit operations are more efficient when done in one
>>"chunk" than in two. That seems intuitive anyway. It would also present
>>a few problems, with the FirstOne() and LastOne() PopCnt() functions that
>>use assembly on the PC but not on the 64 bit machines (yet).
>>
>>Remember that my comparison results are for the "best" 32 bit crafty I have
>>vs the 64 bit machines. Unfortunately, the 64 bit machines have no assembly
>>so they are at a significant disadvantage to start with, yet they are
>>blazingly fast. When I have access to one for a period of time, that
>>advantage will go away completely. Then it will be 32 vs 64 in a _real_
>>comparison, not a biased-toward-32-bit mode as it is done today.
>
>I think the issue is that there may be other things (besides the word length)
>changed on the 64-bit processor which will also improve performance. I hope that
>at the very least you would also run other chess programs (non-bitboard) on the
>64-bit processor and see what type of speed increase that they get versus older
>32-bit processors.
>
>I think that the point of the experiment that we proposed "what it would prove"
>is what portion of the Crafty speed-up is due to the bitboard structures, and
>what portion is due to other factors.
>
>-Keith

The idea is good. But the implementation is flawed. We need a compiler that can produce identical code, except that bitboard ops are done by either 64 bit hardware or by 32 bit operations. But all else stays the same. Unfortunately, compilers don't behave like that. Most likely the 32 bit operations will be poorly optimized, or even flawed. Because what compiler writer would _really_ want to do both well when the machine is really 64 bits? I've done my share of compiler work, and the optimizer is a _lot_ of work. I spent the time on the parts that made "native" code work the fastest. I didn't care about oddball cases.

When an experiment has no useful output, there is little point in doing it. We assume the 64 bit stuff is done efficiently. If the processor has lots of pipes, it might be that doing things in 32 bit chunks is just as fast, if the pipes are mostly idle in the 64 bit program... if the 32 bit code is ugly, the 64 bit program might be way faster. Or barely faster. If the 32 bit code is very good, the 64 bit program might be much faster. Or a little faster. Too many variables. No way to conclude much of anything.

I tend to rely on logic in such cases. Whenever I can replace two (or more in the case of shifts or adds/subtracts) instructions by one, I tend to do that on general principles. Obviously 64 bit hardware can't be slower on 64 bit programs than on 32 bit programs. Obviously, when you think about the machine instructions to do things like AND/OR/XOR/SHIFT/etc you produce fewer instructions for 64 bit machines than you do for 32 bit machines. Fewer instructions -> fewer memory references, 2x the thruput for those 64 bit ops vs 32 bits (>2x for some like shift).
It seems obvious to me that 64 bit machines are better for 64 bit programs than they are for 32 bit programs. How much obviously depends on how heavily the 64 bit program relies on 64 bit operations. I.e., I can't imagine that Crafty does less than 25% 64 bit operations. That represents a significant performance boost. And as time goes on, more 64 bit operations will be added since the speed penalty will be less.
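
To make the instruction-count argument concrete, here is a minimal C++ sketch of the kind of 2 x 32-bit bitboard class suggested earlier in the thread. It is not Crafty code. The names Bitboard32x2, split, join, instruction_ratio and self_check are hypothetical, and the cost model (a fraction f of executed operations are 64-bit, each costing k 32-bit operations) is an assumption made only for illustration. It counts operations and says nothing about pipelines, memory, or how well a given compiler optimizes either version.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2x32 bitboard: one 64-bit value held as two 32-bit halves.
struct Bitboard32x2 {
    std::uint32_t lo;  // bits 0..31
    std::uint32_t hi;  // bits 32..63
};

// Helpers to convert to and from a native 64-bit word (for checking only).
inline Bitboard32x2 split(std::uint64_t v) {
    return { static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32) };
}
inline std::uint64_t join(Bitboard32x2 b) {
    return (static_cast<std::uint64_t>(b.hi) << 32) | b.lo;
}

// AND/OR/XOR: two 32-bit operations where 64-bit hardware needs one.
inline Bitboard32x2 operator&(Bitboard32x2 a, Bitboard32x2 b) {
    return { a.lo & b.lo, a.hi & b.hi };
}
inline Bitboard32x2 operator|(Bitboard32x2 a, Bitboard32x2 b) {
    return { a.lo | b.lo, a.hi | b.hi };
}
inline Bitboard32x2 operator^(Bitboard32x2 a, Bitboard32x2 b) {
    return { a.lo ^ b.lo, a.hi ^ b.hi };
}

// Left shift: bits have to carry from lo into hi, so this takes more than
// two 32-bit operations (plus branches) for what is one 64-bit shift.
inline Bitboard32x2 operator<<(Bitboard32x2 a, unsigned n) {
    n &= 63;
    if (n == 0) return a;
    if (n >= 32) return { 0u, a.lo << (n - 32) };
    return { a.lo << n, (a.hi << n) | (a.lo >> (32 - n)) };
}

// Assumed cost model: instruction count of the 32-bit build relative to the
// 64-bit build, if a fraction f of operations are 64-bit and each one costs
// k 32-bit operations. This is an illustration, not a measurement.
inline double instruction_ratio(double f, double k) {
    return (1.0 - f) + f * k;
}

// Checks that the emulated operations agree with native 64-bit arithmetic.
inline void self_check() {
    const std::uint64_t a = 0x00FF00000000FF00ULL, b = 0x0F0F0F0F0F0F0F0FULL;
    assert(join(split(a) & split(b)) == (a & b));
    assert(join(split(a) | split(b)) == (a | b));
    assert(join(split(a) ^ split(b)) == (a ^ b));
    for (unsigned n = 0; n < 64; ++n)
        assert(join(split(a) << n) == (a << n));
}
```

Under those assumptions, instruction_ratio(0.25, 2.0) comes out to 1.25, i.e. about 25% more operations for the 32-bit build if a quarter of the work is 64-bit and every such operation simply doubles. Shifts push k above 2, which is the ">2x" case mentioned above.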
Last modified: Thu, 15 Apr 21 08:11:13 -0700