Author: Ratko V Tomic
Date: 09:06:48 11/01/99
>>bits for the fully occupied board (without captures, promotions, castle or ep
>>status bits) is 82+64=146 bits...
>
>Different dispositions of pawns: 48! / (32!*8!*8!)
>
>Different dispositions of remaining pieces: 48! / (32!*(2!)^6)
>(Here 48 = 64-16 are the squares not occupied by pawns)
>
>Total dispositions: (48!/(32!*8!))^2 / 64 = 2.14 10^40 ===> 134 bits

You are right, I had ignored all position legality constraints to quickly get an upper bound for the entropy of the full board. But, as your calculation shows, just removing the primary illegal placements for the pawns saves 12 bits for the full board, without perceptibly complicating the calculations. The 134 bits is still an upper bound (since other illegal configurations, such as checks of the wrong side, positions without a legal predecessor, etc., are ignored), but a much tighter one, making it more likely that all positions can fit in 160 bits. With 160-134=26 spare bits, that goal now seems quite feasible.

> I hope to give an help to your arguments, but the problems arise with
> promotions and the worst case is when 12 promotions (if 12 is the max)
> are equally divided among White and Black and make almost equal the
> number of Q, R, N and B in each side (e.g. 4, 3, 3, 3).

Yes, that seems to be roughly the worst case, although I would switch the count of 4 from Queen to Knight (i.e. from the highest-mobility to the lowest-mobility piece), since that would produce fewer possibilities for an illegal position due to illegal checks, and thus more legal positions. One would need a program to go through all possible material variations and add up the numbers of distinct permutations for each, since this doesn't seem like a problem suitable for manual calculation; even without promotions, the captures added 236195 new material-balance cases to the full-board one, and the positions with promotions might increase this tenfold or a hundredfold.

As to the encoding strategy, suppose you have an algorithm to generate a list of all legal positions, and denote the number of legal positions NLP. The number of bits needed to index any position in the list is NB = [log2(NLP)]+ (where []+ denotes rounding up to the next higher integer, i.e. ceil() in C). This is the optimal worst-case encoding. Any other encoding which uses fewer bits than NB for some positions will automatically use more bits than NB for some other positions, thus lengthening the worst-case code. (Although NLP & NB refer to the ideal case of strictly legal positions, this conclusion about the worst-case sub-optimality of variable-length codes holds even if you ignore some types of illegality, as long as you ignore the same types of illegality in both methods, the raw enumeration method and the variable-length code method.)

Therefore, for this particular objective (seeking the shortest encoding for the worst-case position/s), one has to resist the temptation to needlessly shorten the codes for positions which don't exceed the 160 bits. Even more counter-intuitively, whenever one finds a type of position which codes into fewer than 160 bits, one should purposely look for ways to change the encoding method to _increase_ the length of these short codes (without, of course, adding any superfluous redundancy) so as to get as close as possible to the 160 bits. The farther below 160 some type of position falls, the more it needs lengthening.
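To see that trade-off in numbers, here is a small illustration, a sketch in Python with made-up toy figures (the helper worst_case_bits and the pretend NLP of 1000 are mine, not part of the real chess counts), based on the Kraft-McMillan inequality that the code lengths of any uniquely decodable bit code must satisfy:

  from math import log2, ceil

  def worst_case_bits(nlp, short_lengths):
      # Kraft-McMillan: lengths l_i of a uniquely decodable code satisfy
      # sum(2**-l_i) <= 1.  Fix a few deliberately short codes and find the
      # smallest length L that the remaining positions can all share
      # without violating the inequality.
      budget = 1.0 - sum(2.0 ** -l for l in short_lengths)
      assert budget > 0, "the short codes already use up the whole code space"
      remaining = nlp - len(short_lengths)
      return ceil(log2(remaining / budget))

  nlp = 1000                               # a pretend NLP, just for illustration
  print(ceil(log2(nlp)))                   # 10 = NB, the flat worst-case optimum
  print(worst_case_bits(nlp, [8] * 100))   # 11: saving 2 bits on 100 positions
                                           # pushes the worst case past NB

In other words, any headroom spent on making some positions code well below NB does nothing for the worst case.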
This goes against the grain, as it were: when one is in the "compressing" mind-set, it is hard to make oneself think up ways to lengthen the encoding. For example, when we were looking at the king pair + castle encoding and got it down to 12 bits, we should instead have been looking for some non-redundant encoding method which would stretch it toward the 160 bits. Now, that's an interesting goal. The more evenly the average codes approach 160 bits from below, the better the opportunities that will emerge for the worst-case positions to fit in 160 bits. Since we don't actually know whether 160 is the right target (it may be too low or too high), the more general strategy might be to vary the encoding methods, aiming to minimize the difference between the highest and the lowest cases known at a given stage.
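As a closing check, the 2.14 10^40 total and the 134 bits quoted at the top are easy to reproduce with exact integer arithmetic; here is a minimal sketch (Python again, purely as a convenience choice):

  from math import comb, factorial, log2, ceil

  # 8 white + 8 black pawns anywhere on the 48 squares of ranks 2-7
  pawns = comb(48, 8) * comb(40, 8)                 # = 48!/(32!*8!*8!)

  # the 16 remaining pieces on the 48 squares the pawns leave free,
  # divided by 2! for each of the six identical pairs (R, N, B per side)
  pieces = factorial(48) // factorial(32) // 2**6   # = 48!/(32!*(2!)^6)

  total = pawns * pieces
  print(total)               # about 2.14 * 10**40
  print(ceil(log2(total)))   # 134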