Author: Chris Whittington
Date: 04:45:32 11/28/97
On November 27, 1997 at 22:30:40, Robert Hyatt wrote:

>On November 27, 1997 at 10:30:29, Chris Whittington wrote:
>
>>So, hacking your numbers around, you appear to suggest:
>>
>>1. The 32-bit P6 to 64-bit alpha architecture change gives Crafty
>>about 80%
>
>I'm not sure. All I can say for sure is Bruce gets 1.75X on the 533
>machine, I get 3.1X on the 500 machine. If we assume that 1.75 is pure
>machine cycle time, ignoring 32 vs 64 bits, then the rest is likely a
>result of simply using 64-bit words effectively.
>
>>2. With some care and optimisation, you reckon you could get this to,
>>say, 125%
>
>I believe so, yes. IE I know of a faster PopCnt() function for the
>alpha, for example. And as this turns into a hardware operation (which
>is rumored for the 21264) there is a lot to gain yet...

Except I was trying to get at 64-bitmap vs offset performance. The
alpha happening to gain a neat bit counter in the future isn't a
function of the 64-bitmap approach, but of alpha vs PC, which is
another issue. Presumably an offset program would get the same
improvement on an alpha (and CSTal is full of bit counting).

>>But this doesn't imply the 64-bitmap approach for chess is approx
>>100% faster than 32-bit for chess, does it? Because the 32-bit Crafty
>>code is an emulation of 64-bit bitmaps, not code written in the
>>'previous' style of, say, having a byte or a word or whatever per
>>board square.
>
>I personally think that algorithm for algorithm, offset and bitmap are
>equal.

Interesting. This may be. But let's take a knowledge/speed extreme
example for mobility (I'll assume the calculations aren't done
incrementally):

1. We go for speed. The bitmap approach is neat: you just count some
bits (a sketch in C follows below). Offset is slower: we have to trawl
through all the squares, looking for attacks.

2. We go for intensive knowledge. We say mobility is some function of
SAFE movement capability into each square. Now, bitmap or no bitmap, we
have to process each square for safety: basically a swap-off or
exchange evaluation for each piece movement into each square. Isn't
offset faster here, because we already have the attack/defence bits
ranked on each square, smallest attacker value first?

Isn't it the case that bitmaps are neat if we just want to quickly
quantify attacks (basically just add them up)? But as soon as we want
to make qualitative judgements about each attack, and about how the
attacks and defences interact with each other, the offset data seems
more accessible.

>I have to pump 2 32-bit words thru the P6 for everything I do, but I
>get some efficiencies that make up for this. But on the alpha, you are
>*forced* to pump 64-bit quantities around, and if you are forced to do
>this, it obviously pays off to use the full 64 bits if possible, to
>avoid pumping values around that have double the number of bits you
>really need.

Fine if there is some advantage, but not just for the sake of it. You
would be unlikely to argue the case of 128 bits over 64 bits on this
basis, would you? I thought the big deal of 64 bits was that it
equalled the 64 squares on the chess board.

>I have this vague feeling that bitmaps are going to be superior on
>64-bit machines. Based on the problem of pumping/using 64 bits, vs
>pumping 64 but using only 32, which means 1/2 of the data throughput
>is wasted, totally.

As above, why not argue for 128 over 64? Again, 64 is the magic number,
no?
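The sketch promised in point 1, for readers who want "just count some
bits" made concrete. This is a minimal illustration in C, not Crafty's
actual code: PopCnt() is the classic software bit-count loop (the kind
of operation a hardware counter, as rumored for the 21264, would
collapse to one instruction), and FastMobility() with its
attacks/not_own inputs is hypothetical.

    #include <stdint.h>

    typedef uint64_t Bitboard;

    /* Classic software population count: each pass clears the lowest
       set bit, so the loop runs once per set bit. */
    int PopCnt(Bitboard b) {
        int n = 0;
        while (b) {
            b &= b - 1;
            n++;
        }
        return n;
    }

    /* Speed-style mobility: given a piece's attack bitmap and the set
       of squares not occupied by friendly pieces (both hypothetical
       inputs), mobility is one AND plus one bit count. */
    int FastMobility(Bitboard attacks, Bitboard not_own) {
        return PopCnt(attacks & not_own);
    }

The knowledge-intensive mobility of point 2 gets no such shortcut: safe
mobility needs an exchange evaluation per square, whichever
representation is used.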
>>The next interesting question, and one that is again semi-impossible
>>to answer, is just how much better is it (as measured by nps) to be
>>using bitmaps than not. Looks like the upper bound on 'better' is
>>something less than your 125%. One wonders how much less....?
>
>The question is "how many instructions per node?" I suspect. And based
>on my analysis, this answer is potentially lower for bitmaps. IE, for
>a simple example, what is involved in your asking "is the pawn outside
>the square of the king?" I do it in one memory load for a mask plus an
>AND operation and a test. IE it takes so little time I can't measure
>it.

I think you can find many examples of quickness and easiness. But my
case is that intensive knowledge processing is probably better with
offsets.

And this argument then gets taken further. If you select or promote
the bitmap paradigm, you are already promoting and selecting a design:
namely, that of a fast searcher with less qualitative knowledge........

>Many things turn out to be as efficient, on a 64-bit machine. On a
>32-bit machine, I take a hit because everything is done twice.

For Crafty and other 64-bitmap programs, yes, this is true: you take a
hit by trying to emulate 64-bitness on a 32-bit machine. But isn't it
ever so slightly rich to extol the virtues of 64 bits on this basis?

>This is not bad on a superscalar machine like the P5/P6, because the
>inefficient double stuff simply serves to feed multiple pipes in
>parallel. So I keep the pipes better fed than most programs, because
>of this. But it isn't optimal. On the 64-bit machines, I lose the
>duplicate operations, which gives me an almost free 2x performance
>boost. But not using them isn't a kiss of death. But it might end up
>being a disadvantage until someone finds a new approach that is
>non-bitmap based, but which needs to pump 64-bit values around.
>
>>Also of consequence is that these figures are dependent on working to
>>the Crafty-design paradigm. Loosely, I seem to recollect that you (or
>>was it a Bruce description of Ferret?) create piece movement bitmaps,
>>and for sliding pieces, these bitmaps don't account for blocking
>>pieces. Like, for example, if you want to know if a piece is
>>attacked, you pick up the enemy movement bitmaps, test for a hit on
>>the square in question, and, if it's hit by a sliding piece, THEN
>>test for a blocking piece. This seems fine, good and fast, if your
>>program does not need to do many piece attack queries; because if it
>>does, there comes a point where it is probably better to preprocess
>>all of these beforehand, and not waste time doing blocking-piece
>>tests over and over........?
>
>First, the "blocking" test doesn't apply. I take a diagonal's
>contents, index into an array, and get a bitmap showing what squares
>this "piece" attacks on this diagonal. One memory reference and I am
>totally done.

When do you do this? Beforehand, for all square attacks? Or as you
need to know? Again, knowledge intensity would require it done
beforehand; fast-and-easy would allow it to be done on the fly.
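For anyone who hasn't seen the trick, the one-memory-reference lookup
described above works roughly as follows. A minimal sketch under
stated assumptions: it uses a rank rather than a diagonal (diagonals
apply the same idea to a rotated occupancy), the names are invented,
and this is not Crafty's source.

    #include <stdint.h>

    typedef uint64_t Bitboard;

    /* Hypothetical table: for each square, and each of the 256
       possible occupancy patterns of that square's rank, the squares
       a slider on that square attacks along the rank.  Filled once at
       startup, so blockers are already folded into the answer. */
    static Bitboard rank_attacks[64][256];

    void InitRankAttacks(void) {
        for (int sq = 0; sq < 64; sq++) {
            int file = sq & 7, base = sq & ~7;
            for (unsigned occ = 0; occ < 256; occ++) {
                Bitboard att = 0;
                /* Walk each direction, stopping at (and including)
                   the first occupied square -- the only place a
                   blocking test ever happens. */
                for (int f = file + 1; f < 8; f++) {
                    att |= 1ULL << (base + f);
                    if (occ & (1u << f)) break;
                }
                for (int f = file - 1; f >= 0; f--) {
                    att |= 1ULL << (base + f);
                    if (occ & (1u << f)) break;
                }
                rank_attacks[sq][occ] = att;
            }
        }
    }

    /* At search time: extract the rank's contents and index the
       table.  One memory reference, no explicit blocking test. */
    Bitboard RankAttacks(Bitboard occupied, int sq) {
        return rank_attacks[sq][(occupied >> (sq & ~7)) & 0xFF];
    }

The per-line tables are built once beforehand; the attack map itself is
fetched on the fly as the search asks for it, which is the distinction
being probed above.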
>If I am interested in "battery" conditions and so forth, it is not
>hard. IE I haven't found anything I wanted to do

Well, this is the key, no? For the Crafty paradigm it works. For the
knowledge paradigm, I think it gives problems. Your way promotes a
certain sort of chess engine.....

>that wasn't just as easy (or even easier) than what I did in Cray
>Blitz, which was *not* bitmap based at all. If I used each piece's
>attacks repeatedly, I'd likely avoid all the duplicate work, as a
>standard memory-reference optimization, because all of our machines
>are memory-starved and are getting worse every year...
>
>However, the bottom line is I don't think there are any things you
>can do that I can't, nor any things I can do that you can't.

It's software, so we can all do anything :) But the question is how
effectively, and does one way actually steer the designer down
specific pathways?

>The ideas are somewhat complementary. In fact, I have a non-bitmap
>board representation in Crafty (board[64]) that I use for some
>specific things that are easier to do than trying to do them with
>bitmaps...
>
>>Or, what I'm trying to say is that the Crafty results probably don't
>>cover highly intensive evaluation functions.....?
>
>I can't answer. IE I do mobility at zero cost. I do the pawn square
>thing at zero cost. It takes some thought to get into bitmaps, but I
>don't see anything I did before that I can't do now, nor do I see
>anything that I do now that I couldn't have done before. If
>performance is the issue, I suspect bitmaps on 64-bit machines are
>going to be very hard to beat. But if you are interested only in
>knowledge, I don't see why they wouldn't work just as well as the
>traditional offset board representation, and they still should give
>you a performance "kick"...

One concern (repeated here) is that the bitmap doesn't inherently
store the attacker significance. So if we want to find the smallest
attacker we have to hunt around the bits first, and again for the next
smallest, and so on. For qualitative work on the attack/defence of a
square this is going to be a drawback.
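To make the concern concrete, a minimal C sketch of the hunt, with
hypothetical inputs: attacks_to is the set of one side's pieces
bearing on the target square, and piece_bb[] is that side's pieces
split by type. An offset-style attack list could keep the attackers
pre-sorted by value instead of probing type by type.

    #include <stdint.h>

    typedef uint64_t Bitboard;

    enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING, NTYPES };

    /* Find the least valuable attacker of a square.  The bitmap does
       not rank attackers by value, so each piece-type set is probed
       in turn, cheapest first -- and the whole hunt repeats for the
       next-smallest attacker during an exchange evaluation. */
    int SmallestAttacker(Bitboard attacks_to,
                         const Bitboard piece_bb[NTYPES]) {
        for (int pc = PAWN; pc <= KING; pc++)
            if (attacks_to & piece_bb[pc])
                return pc;
        return -1;  /* square not attacked */
    }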
>But notice that it takes a lot of time to get "into" bitmaps and
>start thinking bitmaps. And after that point is reached, they are no
>more obtuse or difficult to use than any other programming paradigm.
>And using macros can produce some pretty readable code and hide all
>the ands/ors/etc...

Agreed. Switching paradigms gets to be difficult, made worse by the
amount of time invested in the previous way of doing things, and by
the age of the person involved. If you're young and haven't spent all
your time doing it one way, it's a lot easier to make the switch.

But I still make the point: the design compromise in choosing bitmaps
or offsets is going to have major consequences for the knowledge/speed
decision.

Chris Whittington