Author: Russell Reagan
Date: 21:21:44 03/02/05
On March 02, 2005 at 18:59:18, Axel Schumacher wrote:

>Hi all,
>I have two questions regarding the storage requirements for information; I hope
>somebody can help me with answering them. Please excuse me if my questions are
>stupid.
>
>1. For each data point (e.g. let's say the position of a pawn on the chessboard)
>one requires 1 bit (either 0 or 1). Right? However, the information does not
>include where the pawn is located. So, how much data has to be stored to
>describe e.g. the position of a pawn?

Practically speaking, if you only want to store the location of a single pawn, you can do that in 6 bits, since 6 bits can hold 64 different values. Theoretically speaking, you must also store the information that maps each of the 64 6-bit values to one of the 64 squares on a chess board. For example, "a pawn is on square 28" doesn't mean anything by itself; we also need to know that square 28 is e4 (or whatever). There is a small sketch of this encoding after these answers.

However, the right answer depends on more details. Do you want a practical or a theoretical answer? Do you only want to store the location of one pawn, or many pawns, or many types of pieces? Do you want to store data about one position, or about many positions (like a database)? Do you want to consider illegal positions with more than 16 pawns? Do you want a solution that works well with sparsely populated boards, or densely populated boards, or one that handles both well? Are you more interested in the average case or the worst case? A good answer will depend upon details like these.

>2. How much calculation power is needed to calculate a certain amount of data? I
>know this may sound a little bit abstract and, of course, it depends on
>the time factor. But let's say you have 1 terabyte of data in a spreadsheet.
>What calculation power (e.g. how many average desktop computers) is needed to
>make simple algebraic calculations with such a data table?

Here also we need more details. How many things does the one terabyte describe (i.e. how many things have to be processed)? At what rate can a desktop computer process the data? One entry per hour? Ten million entries per second? What exactly needs to be done? Do you need one final result (average, sum, etc.), or do you need to keep the results for all entries? Does all of the data have to be processed, or are you looking for something like a maximum or minimum? If so, maybe we could skip some work. Depending upon exactly what you want to know, different algorithms will perform better, and that also depends on more details. If the data is sorted, that could help (depending upon what we're searching for). Do you need an exact answer, or only an estimate? Do you need to prove that the exact answer or estimate is correct, or is a "pretty good guess" okay? A good answer will depend upon details like these. A rough single-pass sketch follows below as well.
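To make the 6-bit point concrete, here is a small C++ sketch. The 0 = a1 ... 63 = h8 numbering and the names squareIndex/squareName are only assumptions for the example; the mapping itself is the "extra information" mentioned above, and any other convention works as long as both sides agree on it.

// A minimal sketch of the 6-bit idea: a square index 0..63 fits in 6 bits,
// and a small, agreed-upon mapping turns that index back into a board square.
// The 0 = a1 ... 63 = h8 layout used here is just one common convention.
#include <cstdint>
#include <iostream>
#include <string>

// Pack a file (0..7 = a..h) and a rank (0..7 = 1..8) into one 6-bit index.
std::uint8_t squareIndex(int file, int rank) {
    return static_cast<std::uint8_t>(rank * 8 + file);   // values 0..63
}

// Recover a human-readable square name from the 6-bit index.
std::string squareName(std::uint8_t index) {
    char file = 'a' + (index % 8);
    char rank = '1' + (index / 8);
    return std::string{file, rank};
}

int main() {
    std::uint8_t pawn = squareIndex(4, 3);            // e4 under this convention
    std::cout << "index " << int(pawn)                // prints 28
              << " -> " << squareName(pawn) << "\n";  // prints e4
    return 0;
}

Under this particular convention index 28 happens to be e4, which is exactly the "square 28 is e4 (or whatever)" point: the 6 bits alone mean nothing without the agreed mapping.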
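And a rough sketch of the second answer: if you only need one final result (sum, average, minimum, maximum), a single streaming pass keeps nothing but a few running values, and its time is simply (number of entries) / (processing rate). The 8-byte entry size and the ten-million-entries-per-second rate below are assumptions for a back-of-envelope estimate, not measured figures.

// One streaming pass over the data: every entry is touched once, and only a
// running sum, minimum, maximum and count are kept in memory.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct Summary {
    double sum = 0.0;
    double minimum = 0.0;
    double maximum = 0.0;
    std::uint64_t count = 0;
};

Summary summarize(const std::vector<double>& entries) {
    Summary s;
    for (double x : entries) {
        if (s.count == 0) { s.minimum = x; s.maximum = x; }
        s.sum += x;
        s.minimum = std::min(s.minimum, x);
        s.maximum = std::max(s.maximum, x);
        ++s.count;
    }
    return s;
}

int main() {
    // Tiny stand-in for the spreadsheet; the real question is one of scale.
    std::vector<double> demo = {3.0, 1.0, 4.0, 1.0, 5.0};
    Summary s = summarize(demo);
    std::cout << "average " << s.sum / s.count
              << ", min " << s.minimum << ", max " << s.maximum << "\n";

    // Back-of-envelope: 1 terabyte of 8-byte entries processed at an assumed
    // rate of 10 million entries per second on a single desktop.
    double entries = 1e12 / 8.0;                  // 1.25e11 entries
    double seconds = entries / 1e7;               // 12,500 seconds
    std::cout << "estimated single-desktop time: "
              << seconds / 3600.0 << " hours\n";  // roughly 3.5 hours
    return 0;
}

Under those two assumptions one desktop needs roughly three and a half hours for the whole terabyte; change either assumption, or ask for per-entry results instead of one total, and the answer changes with it, which is why the details matter so much.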
>I hope somebody can help me with this.
>I'm writing a paper in which I make an analogy between biostatistical calculations
>and calculations in chess (e.g. from a typical chess program). The
>reason for this is to exemplify how biological data can be stored and how it can
>be interpreted. In this special case we are dealing with 3.6 x 10^14 raw data
>points deriving from chemical modifications in the human genome (so-called
>epigenetics). For example, whether a specific DNA base in the genome is methylated
>or not again gives the state 0 or 1 (plus this data has to be referenced). These
>information units could interact in an infinite number of ways, so that it seems
>impossible to make sense out of them. However, IMHO, the analogy with
>the game of chess shows that it still should be feasible to approach the
>problem of complex genetic information. In chess, a small number of rules can
>generate a huge number of board configurations (states), which are analogous to
>the configurations of molecules obeying physiological laws. Chess is also said to
>have an infinite number of possible combinations in its play, but in theory
>the number is finite, since specific positions are impossible, just as not all
>(epi)genetic factors can be found in all functional working combinations. E.g.
>it is said that in chess ‘merely’ ~10^43 to 10^50 states (positions) are needed
>to describe the state (or the game) of the system. Out of these subsets of
>possible states, patterns can be established and calculated. So it is not
>necessary to know every possible state. It is obvious that pure reductionism,
>the theory that all complex systems can be completely understood in terms of
>their components, may not be a fully fruitful approach.
>Yet, recent developments in the field of complexity (e.g. statistical mechanics)
>have come up with alternative statistical approaches. These consider the average
>behaviour of a large number of components rather than the behaviour of any
>individual component, drawing heavily on the laws of probability, and aim to
>predict and explain the measurable properties of macroscopic systems on the
>basis of the properties and behaviour of their microscopic constituents. Chess
>programs don't rely on brute force alone anymore. Maybe such 'pattern
>recognition' or reduction of legal states can help in making sense out of
>complex data.
>Your opinion? Answers to the questions? :-)
>
>Axel