Author: KarinsDad
Date: 08:07:26 10/31/99
On October 30, 1999 at 16:34:54, Ratko V Tomic wrote:

>> Though what you say is true, you are forgetting that you have to take
>> the worst case scenario in order to find a minimum number of bits for
>> a maximally bit-intensive position. The worst case scenario is white king
>> on e1, white rook on a1, white rook on h1, black king on e8, black rook
>> on a8, and black rook on h8. In this case, the 4 bits are required (with
>> the Huffman type of schema).
>
>Yes, but this scenario applies to only a small fraction of positions,
>and my point was precisely that one doesn't need to replicate all 3612
>placements when only a small fraction of the king pair placements
>allows for a castle possibility. Hence, I am assuming that the en/de-coder
>will first decide whether it is a non-castle case based on the position,
>then for the remaining 357 combinations use free codes out of the 484
>unused 12 bit codes (which are not needed for the king pair placement).

Ok, I think I see where our miscommunication is occurring. I am discussing
the ability to compress any legal random position into 160 bits or fewer.
This is easy to achieve for any "normal" position that occurs between two
players who are both attempting to draw or win. However, for positions with
zero pawns, twelve promoted pieces, and no captured pieces (which would
never even come close to happening in a real game), it starts becoming more
difficult. I am not discussing the best way to compress a series of
positions into a database or any other such mechanism.

>> Yes, of course. This is the 12 bit king proposal I mentioned in my
>> previous message.
>...
>> However, you are again forgetting the worst case scenario. For the 12
>> bit king/king state schema, the worst case is that castling cannot be
>> done for either side. Hence, you have to indicate where the rooks are
>> (in a worst case position).
>
>It can't be the same proposal if you need more than 12 bits for the king
>pair and an arbitrary castle state.

It is the same proposal; you have to read the entire thread here. There
were two algorithms mentioned: one using 5 bits per king plus 2 bits for
castling state information, for 14 bits; and another using 12 bits for both
kings including the castling state information. What I was trying to state
in my previous paragraph is that the 12 bits CAN tell you where a rook is
IF castling is allowed, and in those cases it saves bits. However, for
reduction purposes of the worst case bit pattern, there are positions where
castling is not allowed for either side while all 4 rooks are still on the
board, and hence they will take up space within the bit pattern one way or
another.

>The reason 12 bits is enough space is that the worst cases apply only to a
>small fraction of the king pair placements. Hence if the decoder gets a
>number 0..3611 it is a non-castling position (regardless of where the
>kings are). Then the next 58 codes, 3612..3669, specify that the white
>king is on e1 with only the kingside castle

Ok.

>and the black king on any of the 58 available squares without castle
>rights (as suggested earlier, this can be reduced some more by eliminating
>other illegal castle cases due to attack/blocking by the 2nd king, but
>that wasn't needed to fit it all in 12 bits).

Yes, when I worked on this, I observed this myself. However, I could not
come up with a reason to reduce it further, since there were not enough
bits left over to do anything real with them.
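As a sanity check on the counts being used here (this sketch is mine, not
from the thread), the 3612, 484, and 357 figures can be verified by brute
force in a few lines of Python:

import math

# Count ordered king-pair placements on distinct, non-adjacent squares.
pairs = sum(
    1
    for wk in range(64) for bk in range(64)
    if max(abs(wk % 8 - bk % 8), abs(wk // 8 - bk // 8)) > 1
)
print(pairs)         # 3612 non-attacking king-pair placements
print(4096 - pairs)  # 484 spare codes in a 12 bit field

# Castle codes per the scheme quoted above: 3 castle states for the
# white king on e1 times 58 squares for the black king, mirrored for
# black, plus 9 cases with both kings on their home squares.
print(3 * 58 * 2 + 9)                     # 357
print(math.log2(pairs + 3 * 58 * 2 + 9))  # ~11.95, fits in 12 bits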
>Then the next 58 codes, 3670..3727, specify the white king on e1 with only
>the queenside castle allowed, and the black king on any of the available
>58 squares without castle rights. Finally, the next 58 codes, 3728..3785,
>specify the white king on e1 with both the kingside and queenside castles
>available and the black king anywhere else without castle rights. That
>uses up 3*58=174 codes. You need another 174 codes for the reversed
>colors. And finally there are 9 cases with the white king on e1 and the
>black king on e8, and both kings with some type of castling rights. Hence
>all cases where at least one king has some castling rights use up
>174+174+9=357 codes. Since all cases with no castling rights use 3612
>codes, you need 3612+357=3969 codes for the whole shebang, which is about
>0.05 bits shorter than the target space, the 12 bits.
>
>Perhaps I am not following some other constraint you considered for your
>other objectives, but if someone gives me any position of 2 kings and any
>combination of their castling rights, I can put it all into 12 bits and
>restore it later exactly (since there are at most 3969 such positions).
>That's the only constraint or objective I had assumed. (Using the full
>placement of other possible pieces, or even a subset of them, can only
>reduce this encoding to fewer than 12 bits as discussed below, if for no
>other reason than for reducing the number of available squares.)

Of course you can. I was merely pointing out that for my purposes, this
saves me bits for the castling status, but does not really save me bits
for the rooks (since for any position where castling can occur, you can
always come up with an identical position where it cannot).

>>> in addition to the 2 kings, one can further reduce the size of the
>>> combined cluster of these 2-6 pieces with castling status (any
>>> castle-ok status also defines the content and the attack status on the
>>> squares in between the king and the rook, giving thus additional
>>> savings).
>>
>> This is not correct. A protected black knight can be on b1, preventing
>> white queenside castling. However, if the knight moves to a3, castling
>> can then be done (assuming b1 is the only backrank square being attacked
>> on the queenside). The castling state has to be preserved, and you
>> cannot make an assumption as to what piece can or cannot be on the
>> backrank, nor can you make assumptions about the attack status of the
>> squares in between (which doesn't affect the right to castle overall,
>> merely the right to castle at the moment).
>
>I think this is again the same case of miscommunication as for the king
>pair + castle fitting into 12 bits. My semantics for the word "encoding"
>is a bit more general than what you use that word for, allowing for a
>finite state machine and synchronized parsers in the encoder and decoder.
>Hence I allow for any procedure, including a multipass one, as well as a
>legal move generator, which can be uniquely decoded. Therefore, in a
>multipass algorithm, where at any point the algorithm remembers (and can
>make any suitable use of) _all_ the info processed so far, one can not
>only make assumptions about, but know exactly, the attack status of any
>square, as long as the earlier pass (or generally, the decoding process so
>far) has produced sufficient piece placement info to generate the needed
>legal moves.
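A minimal sketch of the "synchronized parsers" idea described above (my
illustration; the helper names are hypothetical): both sides derive each
field's width from the state decoded so far, so the widths themselves never
need to be transmitted.

import math

def field_width(num_choices):
    # Both encoder and decoder compute this from shared state;
    # a forced choice costs 0 bits.
    return 0 if num_choices <= 1 else math.ceil(math.log2(num_choices))

def decode_square(bits, occupied):
    """Read just enough bits to pick one of the still-free squares.
    'bits' is an iterator over 0/1; 'occupied' grows as decoding
    proceeds, so later fields shrink and the decoder stays in step
    with the encoder by construction."""
    free = [sq for sq in range(64) if sq not in occupied]
    idx = 0
    for _ in range(field_width(len(free))):
        idx = (idx << 1) | next(bits)
    occupied.add(free[idx])
    return free[idx]

occupied = set(range(60))             # toy state: only 4 squares left
bits = iter([1, 0])                   # encoder sent index 2 in 2 bits
print(decode_square(bits, occupied))  # 62

A legal move generator slots into the same pattern: anything both sides can
compute (attacked squares, illegal placements) shrinks the choice list and
with it the code space.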
When you are talking about a database of associated positions, these types
of techniques work fine. But for a random legal chess position, the legal
move generator approach will not work, since it takes up too many bits. A
multipass decoder has a chance of working; however, I have yet to come up
with a reason to do it in this manner.

>As to how that would go in a chess encoding, I haven't really worked on
>that problem at all to give you a full specific state machine, but over
>the years I have written various compression algorithms, the standard ones
>and several I invented (which are the clients' trade secrets at present;
>e.g. the code on the AOL servers used for the domain name to IP
>translation database, as well as the filtering/blocking [for kids'
>access], is my compression code and the compressed database search). So,
>having dealt with encoding and compressing the same kind of irregularly
>constrained enumerations, I know it can be done for chess positions, too.

Agreed. I have also worked on compression routines for graphics, so I know
it is doable. The problem is the small amount of data, which does not lend
itself well to standard compression routines, but rather to a set of
routines specific to the rules of chess.

>The king+castle in 12 bits was merely an illustration of this type of
>state machine based encoding, which I thought wasn't utilized as
>efficiently as possible in the proposals presented here earlier (since it
>was stated that extra castle bits are needed in addition to the 12 bits
>for the king pair; the encoding I suggested was merely a counter-example
>for that assertion).

And a good one it is. In fact, it is in my current algorithm. I have tried
a wide variety of ways to shorten the number of bits for those hungry
kings. For example: store the king as a bishop, and then at a later time
attempt to determine which piece is the king. However, with pawn
promotions, there could be 5 to 11 bishops/kings of each piece color on a
given colored square, and it would take 6 bits to sort out which 2 are the
kings. Given that it took 8 bits to place the "bishop/king" on the board,
that is a total requirement (worst case) of 14 bits, which is 2 more than
the 12 bit king storage (and it doesn't include castling status). None of
these "tricks" that work for normal positions work for massive pawn
promotion positions.

>Now, in the specific case of combining codes for the rooks and the king
>pair together, the saving comes not from the relation to other possible
>pieces, but from the observation that the suggested 12-bit encoding for
>king pair+castle, with the rooks encoded separately (with the other pieces
>& squares), is obviously wasteful, since those 357 codes which specified
>castling would combine with the independent code for the rest of the
>pieces; hence they would have reserved code space for many illegal
>combinations of rook placements with any of the 357 castle-ok cases from
>the 12 bit code. In fact, when any of the king castle states is true, the
>valid rook placements are reduced drastically. So my suggestion above was
>to enumerate all valid 0+0...2+2 rooks and 1+1 kings placements and
>allocate codes for castle rights in only a small fraction of the possible
>geometric placements of these 2-6 pieces, the same way we did for the king
>pair: there were 3612 mutually non-attacking king pair placements, but of
>them, at most 2*58-1=115 geometric layouts were compatible with some
>castle rights, hence only these cases needed additional codes (357 codes
>were needed to cover all combinations of castle rights for these 115
>layouts). The same type of saving would be obtained by combining rooks
>with kings.

No, this is incorrect. The reason is simple enough:
Since the 2 kings take up 2 squares, that leaves approximately 62 squares
for the a1 rook, 61 squares for the h1 rook, 60 squares for the a8 rook,
and 59 squares for the h8 rook. Dropping illegal positions for the moment,
that results in 62*61*60*59 = 13388280 combinations, or just under 24 bits
for the rooks, i.e. 6 bits per rook. Let's do the real math:

Method 1: 12 bits for kings and castling, plus 4.33 bits per rook * 4 rooks
(this includes color and position) = 29.33 bits.

Method 2: 12 bits for kings and castling, plus 5 bits per rook * 4 rooks
(this includes color and position, for when we cannot get away with a minor
piece compression, i.e. many pawns) = 32 bits.

Note: Methods 1 and 2 are the methods I use, depending on which one
compresses more (with a lot of pawns, the second method actually compresses
better).

Method 3 (your method; note that white here will be the side to move, so I
will do reductions for illegal white rooks):

240 (black king in a corner: 4 * 60, no castling) * 62 black rook * 61
black rook * 58 (reduced from 60, since a white rook cannot be right next
to the black king when it is white to move) white rook * 57 white rook
= 3000790080

1392 (black king on an edge: 24 * 58, no castling) * 62 black rook * 61
black rook * 57 (reduced from 60, as above) white rook * 56 white rook
= 16804424448

1980 (black king in the interior: 36 * 55, no castling) * 62 black rook
* 61 black rook * 56 (reduced from 60, as above) white rook * 55 white rook
= 23064148800

56 (white king on e1, queenside castling, black king cannot be next to the
white rook) * 61 black rook * 60 black rook * 59 white rook = 12092640
(I am not going to do a reduction here, since this value does not greatly
impact the final result, and this entire portion of the equation would have
to be broken down into the 3 possibilities of the black king in a corner,
on an edge, and in the interior.)

56 (white king on e1, kingside castling) * 61 * 60 * 59 = 12092640

54 (white king on e1, both side castling) * 60 * 59 = 191160

56 (black king on e8, queenside castling) * 61 * 60 * 59 = 12092640

56 (black king on e8, kingside castling) * 61 * 60 * 59 = 12092640

54 (black king on e8, both side castling) * 60 * 59 = 191160

(both kings on their start squares, both sides can castle to at least one
side) = approximately 12101

3000790080 + 16804424448 + 23064148800 + 12092640 + 12092640 + 191160
+ 12092640 + 12092640 + 191160 + 12101 = 4.29e10, or just under 36 bits.

This is similar to the approximate result I indicated above with 12 bits
for the kings and 24 bits for the rooks. At most, the illegal position
reductions could drop 1 bit (from 36 bits down to 35 bits), but as was
shown, they did not even do that (they dropped about half a bit).

As can be seen, the other two methods save considerably (4 to 6+ bits) over
folding the rook positions in with the king positions. The main reason for
this is that it takes fewer bits to store a position as a "location bitmap
and a type of piece per location" (5+ bits per piece) than the nearly 6
bits per piece of a "store each piece via location" method.
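The totals above can be checked mechanically; here is the tally as I read
it (the per-case factors are copied from the post, so this only verifies
the arithmetic, not the case analysis):

import math

cases = [
    240  * 62 * 61 * 58 * 57,  # black king in a corner, no castling
    1392 * 62 * 61 * 57 * 56,  # black king on an edge, no castling
    1980 * 62 * 61 * 56 * 55,  # black king in the interior, no castling
    56 * 61 * 60 * 59,         # WK e1, queenside castle only
    56 * 61 * 60 * 59,         # WK e1, kingside castle only
    54 * 60 * 59,              # WK e1, castling both sides
    56 * 61 * 60 * 59,         # BK e8, queenside castle only
    56 * 61 * 60 * 59,         # BK e8, kingside castle only
    54 * 60 * 59,              # BK e8, castling both sides
    12101,                     # both kings home, both may castle
]
print(sum(cases))                 # 42918128309, about 4.29e10
print(math.log2(sum(cases)))      # ~35.3, "just under 36 bits"
print(12 + 4.33 * 4, 12 + 5 * 4)  # Methods 1 and 2: ~29.3 and 32 bits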
>As to savings due to other (than rooks & kings) pieces: yes, of course, in
>conjunction with a move generator which watches attacks, and a position
>scanner which follows the possible directions of attacks/blocking on the
>castle-critical squares, knowing that some castle is valid eliminates
>numerous possible friendly and enemy pieces as possibilities on the
>so-scanned squares. Note that if one were to take advantage of this type
>of dependency/constraint, the pieces need not be specified in full in one
>pass. Namely, the precise value/code for the pieces may be phased in over
>several passes as various constraints (such as castle, ep, checks,
>promotions, etc.) are passed on by the encoder and then imposed by the
>decoder, in a way analogous to the multipass transmission of pictures on
>some web pages, where in the first pass you get a low res version, in the
>next pass this gets refined, and so on until the full sharp picture is
>reconstructed after several passes. The low res passes don't necessarily
>go to waste (in terms of wasted bandwidth) since they help constrain and
>compress better the image for the next higher res pass (e.g. with quadtree
>encoding).
>
>> Actually, EP can be done even more efficiently by knowing that you can
>> save at least 4 bits in an EP position by removing the 2 pawns and then
>> adding 4 EP bits. If you have 15 or 16 pawns in the representation of
>> the position, then you know not to check for EP, since the
>> representation would have had only 13 or 14 pawns respectively (due to
>> the algorithm) for the 15 or 16 pawn position. In other words, EP takes
>> up no additional space whatsoever (with regard to the overall minimum
>> size for a maximally bit intensive position).
>
>The full ep status encoding never needs more than 4 bits, since there can
>be at most 8 ep capture targets. As suggested at the end of the earlier ep
>encoding proposal, once the decoder has reconstructed all the pawn
>positions, it can index all ep capture candidate targets with codes
>0..MaxEP (where MaxEP is never greater than 8, and code 0 is reserved for
>the no-ep-targets symbol).

Actually, the decoder in the EP schema I mentioned earlier in this thread
does not initially know of any EP pawns (i.e. two of the EP pawns are NOT
in the representation of the position at all). The EP decoding still
requires 4 bits; however, those 4 bits are only needed in representations
where at least 4 bits were already saved due to the 2 pawns being removed
from the representation. Also, those 4 bits represent 14 possible capture
source and destination combinations plus 1 possibility of no EP. For
example, the 14 possible EP captures for white to move would be:

a5*b5, b5*c5, c5*d5, ..., g5*h5, h5*g5, g5*f5, ..., c5*b5, b5*a5

Since we know the side to move, we know from these 14 states which 2 pawns
are missing from the representation. This is a much better way of doing it
than yours, for the simple reason that IF we have 15 or 16 pawns in the
board representation, then we know that we also do not have an EP
condition, and the extra 4 bits are not needed. Positions with 16 pawns and
EP are represented as 14 pawn positions plus the 4 bit EP indicator (-4
bits for the 2 missing pawns, +4 bits for the EP representation). This
means that a 16 pawn (or any number of pawns) position can be represented
in the EXACT same number of bits, regardless of whether there is EP. Hence,
EP does not require 4 more bits (as in your type of representation).
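For concreteness, the 4 bit EP field just described covers 15 states (14
capture combinations plus "no EP"); a quick sketch, with the square naming
being my own shorthand for the list above:

files = "abcdefgh"
ep_states = ["no-ep"]                     # code 0: no en passant
for i in range(7):                        # a5*b5 ... g5*h5
    ep_states.append(files[i] + "5*" + files[i + 1] + "5")
for i in range(7, 0, -1):                 # h5*g5 ... b5*a5
    ep_states.append(files[i] + "5*" + files[i - 1] + "5")
print(len(ep_states))                     # 15 states, fits in 4 bits
# Each capture state names both pawns involved, so the decoder can
# restore the two pawns that were dropped from the representation.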
Note: I realize that your representation does not need the 4 extra bits
UNLESS there is an EP possibility. However, the worst case scenario (i.e.
the largest representation) will be an EP case; hence, the 4 extra bits
will be required for SOME positions.

>For example, if after processing the pawn placement the decoder finds 3 ep
>target candidates (of whatever color), it will expect a two bit code to
>tell it which of the 3, if any, is an ep-active target (codes 01, 10, 11,
>or code 00 if no potential target is ep-active). If no ep-capture
>compatible pawns are found by the decoder, it doesn't expect any ep
>related bits to follow, i.e. 0 bits are used for the ep status of such
>positions. So in a vast majority of positions you would send fewer than
>the 4 bits needed for a full worst case ep status. As long as the encoder
>and the decoder use the same convention to index the targets and the same
>multi-phased position reconstruction, all this can be decoded uniquely.
>
>(Since you were using Huffman codes, it means you're not requiring the
>same number of bits for all positions, and the average number of bits can
>be much smaller than the worst case number of bits. I am not sure what
>your larger objective was, but from some statements it seems as if it is
>to minimize the length of the worst case position, as if you're looking to
>fit them all later into fixed size blocks; while sometimes it seems you're
>looking for the best average/overall encoding. These two objectives are
>contrary to each other, at least in the high efficiency regime of
>compression for both.)

The goal is to fit any possible legal random position into 160 bits (this
being a good number since it is 20 bytes).

>> Now, the major problem. This schema does not save a single bit in the
>> worst case scenario. The reason is that the a file can be blank, the
>> backrank can be blank, the a1-h8 diagonal can be blank, and the squares
>> b3 and c2 can be blank. In this type of position, there are still 40
>> squares remaining on the board for the other pieces, so there are still
>> plenty of positions in which various subsets of 27 to 31 other pieces
>> are on the board. Since you can only save space if the pieces are in the
>> proper positions, you cannot save any space if they are not.
>
>Yes, of course, encoding kings plus castle rights separately from the
>other pieces (in 12 bits) is clearly wasteful. Combining the king pair
>with the rooks into a common enumeration set removes some of the waste,
>but still nearly not enough. The most economical is the encoding using all
>pieces in a common enumeration, and using the move generator in the
>appropriate phase of the encoding/decoding to eliminate various illegal
>positions.

Yes, for database representations of MANY positions and of multiple
positions from the same game.

>The only reason I mentioned 2 kings+castle in 12 bits was because of the
>statement that they need more than 12 bits in the general case. The
>encoding I suggested easily fits 2 kings plus any castle status in 12
>bits, with room to spare.
>
>>> Or if one is really ambitious, one may use dictionary methods as the
>>> second phase and finally the statistical encoding of the dictionary
>>> symbols. The dictionary methods would derive economy from the commonly
>>> found clusters of pieces and their typical placements.
>>
>> Please explain this in more detail. Maybe an example. Thanks.
>
>I was thinking of ways for improving the average compression, not
>necessarily the worst case size, i.e.
>if you had a large database of positions you wished to compress (while
>still being able to search for any exact position), then by identifying
>common subsets of pieces, e.g. in their relative positions (say, a
>fianchetto bishop with its pawns, rook & king), which might often occur in
>the database, one would replace each entire such combination with a single
>symbol. Then, assuming you have identified the most frequent clusters of
>pieces and assigned separate symbols to them, the frequencies of these
>symbols across the database are used to pick the right Huffman or
>arithmetic codes for the final encoding.
>
>As to how one would go about creating these "frequent clusters", one quite
>effective way I have used in various other compression projects (which
>many people have thought up independently, including myself; I had named
>my own variant a "recursive clustering compressor") is to start looking
>for the most frequent (across the database of positions) pair of pieces
>(of whatever color combination) in a given geometric relation (or in an
>absolute placement). Once that pair is found, a new "virtual piece" (or
>symbol) is assigned to that pair and the database is "re-written"
>(conceptually only; in practice this can be done much faster with linked
>lists and priority heaps; compressing text files this way is only slightly
>slower than pkzip, while giving a 15-20% higher compression ratio),
>replacing each occurrence of such a pair with the new symbol, as if
>instead of 6+6 types of pieces you had 6+6+1 types of pieces. Then you
>repeat the process the same way, looking for the most frequent pair, but
>using the 6+6+1 types of pieces as the basic symbols, to get another new
>"virtual piece", then "re-write" the database again using these 6+6+2
>types of pieces, and so on, until no further gain is obtainable (the gain
>depends on the symbol encoding methods and how the dictionaries are passed
>on, or whether they're counted at all).
>
>So in, say, a database of middle game positions, the clustering may first
>find the most frequent pair (WK+WP diagonally to the left front of it, if
>we use relative positions, or, say, WK-g1, WP-f2 for an absolute positions
>dictionary) and make it a new "virtual piece" #12 (codes 0..11 being used
>by the basic pieces). So it replaces any such occurrence with the new
>piece #12 (with placement given by any convention, as long as it is the
>same in the encoder & decoder). This new piece may in the next pass merge
>with the pawn on g3, then this one with the pawn on h2, and this one with
>the bishop on g2, thus forming a dictionary entry (symbol, virtual piece)
>for a fianchetto cluster. At the end you may have a few thousand "virtual
>pieces" which form the dictionary. Then a database may be encoded from
>scratch with these dictionary entries using the optimal parsing algorithm
>adapted to the particular symbol encoding method (which on text
>compression using this method saves another 5-10% of space, but it uses
>2-3 times more time than the rest of the algorithm, so in some cases this
>phase may not be suitable for performance reasons).
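The "recursive clustering compressor" described above is, at its core, the
same repeated pair-substitution used in byte-pair style dictionary methods;
a toy sketch of mine over a flat symbol list (a chess version would pair
pieces in geometric relations rather than adjacent list entries):

from collections import Counter

def recursive_cluster(symbols, max_new=10):
    # Repeatedly replace the most frequent adjacent pair with a new
    # "virtual" symbol, growing the alphabet 6+6 -> 6+6+1 -> ...
    dictionary = {}
    next_id = max(symbols) + 1
    for _ in range(max_new):
        pair_counts = Counter(zip(symbols, symbols[1:]))
        if not pair_counts:
            break
        pair, freq = pair_counts.most_common(1)[0]
        if freq < 2:
            break                   # no further gain obtainable
        dictionary[next_id] = pair  # new virtual piece/symbol
        rewritten, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                rewritten.append(next_id)
                i += 2
            else:
                rewritten.append(symbols[i])
                i += 1
        symbols = rewritten
        next_id += 1
    return symbols, dictionary

packed, dic = recursive_cluster([1, 2, 3, 1, 2, 3, 1, 2, 4])
print(packed, dic)  # (1,2), then ((1,2),3), become virtual symbols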
>The position compression is one application of the "virtual piece"
>concept. The playing programs (primarily the knowledge based ones, such as
>those of the Botvinnik type, where I saw this idea, although he didn't use
>the term "virtual pieces" or assign values to them) would benefit as well
>from representing chess knowledge in terms of such pieces. These virtual
>pieces are created and destroyed like normal pieces (which are created via
>promotion and destroyed via capture), and have their own typical values,
>just like the elemental pieces. They operate/live not within the lower
>layer alpha-beta searcher tree (although they can be kludged in there too,
>like the bishop pair and such), but in the high level sparse/human-like
>planning/reasoning tree. The virtual pieces encapsulate and score
>effectively the idea of piece cooperation and typical maneuvers against
>typical targets (the bishop pair is a well known one, rooks & open lines
>another, but there are hundreds more of varying importance). Obviously,
>they cannot be extracted from a database using the simple static frequency
>pairing used for compression. To extract the most useful such virtual
>pieces, one can use an analysis module which considers the joint attacks
>or joint movements over a span of moves in a high quality game (or, if CPU
>time is a non-issue, perform searches with various pieces removed and
>identify what is the minimum cluster needed for a given evaluation
>improving sub-sequence of moves from the game). Some clusters may even be
>manually extracted by strong chessplayers.

This is very interesting, even if it is a mechanism aimed at a totally
different goal. I like it. I will have to look at it in more detail some
day.

KarinsDad :)