Author: Robert Hyatt
Date: 13:49:15 10/23/01
On October 23, 2001 at 11:01:40, Ratko V Tomic wrote:

>>I don't see how. I will never probe them most likely.
>>Because once I get a TB "hit" I don't search down that
>>branch any further, period.
>
>Well, not in that evaluation. But in the hundreds of millions
>of evaluations with a limited number of pieces (which may
>have a few hundred million distinct arrangements all together)
>the odds of hitting the "wasted" piece arrangement later are
>certainly much greater than the odds of hash table hits in
>the middle game. Your cutoff argument would equally imply
>that the middle-game hash tables are useless.

Actually they are pretty useless except for move ordering. Hardly any hash hits with EXACT values happen in the middlegame, although there are a fair number of <= alpha or >= beta type cutoffs. But what would this have to do with searching the position _below_ a hash entry and then storing that as well? It won't get hit if the current entry is hit first. And if it isn't...

I am simply saying that the EGTBs are huge. 6-piece files are going to be beyond 2 gigabytes with a pawn, or with some piece combinations. You aren't going to store any great part of that in memory anyway, and stomping through 100 reads to find the distance-to-mate seems horrible, as it will take you from the 6-piece file, to the 5-piece file, to the 4-piece file, etc., depending on how the fastest mate is carried out. If EGTBs are put on a CD, they absolutely kill performance. A CD isn't 100X slower than a good SCSI disk, but doing 100X the I/O certainly is 100X slower, and would be unbearable.

>The point is
>that you can arrive at the same position many different ways,
>especially for the kind of material on the board the
>tablebases cover.

Not as often as you think. The hash table and killer moves guide the search down similar pathways whenever possible, which means you will at least be more likely to hit the original entry than the other 100 "seeded" entries.
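The EXACT-versus-bound distinction above can be sketched with a minimal transposition-table probe. This is a generic illustration, not Crafty's actual code; the entry layout and names are hypothetical:

```python
from dataclasses import dataclass

EXACT, LOWER, UPPER = 0, 1, 2  # value types stored with each entry

@dataclass
class TTEntry:
    key: int      # full hash key, to detect index collisions
    depth: int    # search depth the stored value is good for
    value: int    # score from the earlier search
    flag: int     # EXACT, LOWER (failed high, >= beta), UPPER (failed low, <= alpha)
    best: int     # best move found; useful for move ordering even without a cutoff

def probe(table, key, depth, alpha, beta):
    """Return (True, value) if the entry permits a cutoff,
    else (False, best-move hint or None)."""
    e = table.get(key)
    if e is None or e.key != key:
        return False, None
    if e.depth >= depth:
        if e.flag == EXACT:
            return True, e.value           # the rare exact hit Hyatt mentions
        if e.flag == LOWER and e.value >= beta:
            return True, e.value           # fail-high cutoff
        if e.flag == UPPER and e.value <= alpha:
            return True, e.value           # fail-low cutoff
    return False, e.best                   # no cutoff, but still guides ordering
```

Even when no cutoff occurs, the returned `best` move is what makes the table "useful for move ordering", since searching it first tends to steer the search down the same pathways.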
And if you store all 100 "seeded" entries you are going to blow out the hash table in another way...

>Note also that fetching a chunk of table from the disk, to
>perform what seems to the program "one lookup," via file read
>or the "transparent" memory paging by the OS, will take more
>time than a few hundred simple RAM/CPU based calculations to
>deduce the same value. That's why compressed disks often work
>faster than uncompressed -- a single extra wait of a
>few milliseconds for a disk spin to fetch a page pays for
>millions of computation steps of the decoder, which in turn
>covers a much greater quantity of data than the page being
>fetched.

What do you do about the 100X larger hash table requirement? If you store every probe result, you get buried. If you don't, you get buried. In short... you get buried in a lot of overhead.

>>>Considering also that using the best-move you may be able to
>>>fit a couple times more table information on any given disk
>>>or RAM size you have available,
>>
>>All the TBs so far use 8 bit values. So there is no savings
>>on the "best move" approach until we get to the 6 piece files
>>with mates > 127...
>
>Well, no saving if you encode the best move in a byte. With the
>right pattern preprocessor/dictionary extractor and move generator
>the best move ought to fit on average in 1 bit or less. The best
>move table can easily take advantage of any endgame rules or
>knowledge, human or computer extracted patterns, to make the
>move generator produce the best move in many fewer tries than
>a random move generator would (which is about the most pattern
>that Huffman or similar memoryless stream compression could
>squeeze out).

I don't see how the "best" move can fit into "one bit or less". If you have to have large external dictionaries, etc., then the cost of those has to be included as well.
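The I/O argument above can be checked with back-of-envelope arithmetic. The latency figure below is an illustrative assumption for a circa-2001 SCSI disk, not a measurement:

```python
# Assumed latency for one random tablebase read (disk seek + rotation).
seek_ms = 8.0

# Normal scheme: one read per EGTB hit, then cut off the branch.
normal_cost_ms = 1 * seek_ms       # ~8 ms per hit

# "Seeded" scheme: walk the mating line through the 6-, 5-, 4-piece
# files, ~100 reads per hit.
seeded_cost_ms = 100 * seek_ms     # ~800 ms per hit

# The slowdown is the probe count itself: the 100X factor comes from
# doing 100X the I/O, not from the storage medium being 100X slower.
print(seeded_cost_ms / normal_cost_ms)   # prints 100.0
```

At nearly a second per tablebase hit, even a handful of hits would dominate the whole search, which is why deferring the extra probes ("later might never come") wins.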
I think that practically, an 8-bit index is the minimum that can be _safely_ used, because we know there are over 128 possible moves in many positions, but so far never more than 255, so 8 bits will work for all cases. This won't save a thing until the deeper mates in 6-piece files overflow to 16 bits (or 12 bits or whatever is considered reasonable). But the overhead... I/O is horribly slow. I would _never_ do 100 probes now, on the hope that I will use a few of them again later. The right way is "always put off until later that which you can do without now, because later might never come if you get a good cutoff."

>With distance-to-mate tables the only compression available is
>of that 'memoryless stream' type, since the values in these tables
>are the result of multi-step iterations of highly nonlinear functions,
>so for all practical purposes any simple connection to the
>patterns in the initial positions is lost by the time you reach
>the terminal node and get the distance value. The only regularity
>left that is readily extractable is the bulk frequency of
>different distance values. The rest of the pattern is obliterated.

This is only true for current implementations. The "encoding" could be changed to group draws and non-draws closer together, resulting in better compression. But while that is theoretically possible, practically it is another thing.

>The best-move tables are more similar to gradient methods of
>optimization which are sensitive to the local patterns. The move
>generator used in the best-move table decoder can easily and
>beneficially (for compression ratio) absorb any amount of
>knowledge/regularity in data one can come up with.

"Easily and beneficially" doesn't lead directly to a "how" or an implementation. I don't see an efficient way of building such a generator. And until that critical step is completed, even the theoretical advantage of such a scheme will be unrealizable.
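Tomic's "1 bit or less" claim amounts to storing the best move as its rank in a knowledge-ordered move list and entropy-coding the ranks: if the generator's first suggestion is usually right, the average code length drops below one bit... but only if such a generator exists. A sketch of the arithmetic, with an invented rank distribution purely for illustration:

```python
import math
from collections import Counter

def entropy_bits(ranks):
    """Average bits per move if the ranks were entropy-coded
    (Shannon lower bound on the code length)."""
    counts = Counter(ranks)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical: a knowledge-based generator proposes the true best move
# first in 80% of positions, second in 15%, third in 5%.
ranks = [0] * 80 + [1] * 15 + [2] * 5
print(round(entropy_bits(ranks), 2))   # prints 0.88
```

So the compression claim stands or falls with the quality of the generator, and Hyatt's objection is precisely that the generator, and any dictionaries it needs, are the unsolved (and uncosted) part.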
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.