Author: Dan Newman
Date: 02:16:08 07/03/00
Go up one level in this thread
On July 02, 2000 at 06:04:50, TEERAPONG TOVIRAT wrote: >>Yes, sure, you are right, and I think I do understand it quite well. >>I just wanted to say, that the resulting approximate formula for the >>error probability depends on N and H, and not on P (IMO). > >Thanks for your interest to my post. No one besides you seems to do so. >Perhaps it's too difficult for me to express the third factor (P/H ) in >detail(English). >Let me try. Do you notice N and H are relatively constant or controlled >by limitation of RAM ? In my hypothesis I base on assumption that every >hash value has approximately equal number of positions,but I think in >the actual situation is different. I don't know about the real distribution. >I know only a lot of people here say " not all random numbers are suitable >for hash value some of them leading to clash so often than others" . >So, the third factor should be the real number of positions that have the >same hash value not just by approximation. And now if my hypothesis >is correct. Will it be useful? As I said above we cannot control N,H. >I think we can detect a good series of random numbers by using these facts. >In last few days I spent hours in experiment about this. I test my checkers >program >with 32 bit hash table. The results are quite the same as I expect. >At the early stage of game the incidence of the type 1 error is so small. >As the number of positions occupied on the table increase the incidence >also increases until the table is fully occupied the incidence tend to be >stable at some constant. My current problem is my incidence of error is >higher than I expect. I got 0.5% instead of 1/4000. Do you think I can blame >it on my random integers? or Do you have any idea to generate a good series >of random number? >I'm sorry it's quite a long post. > >Thanks, >Teerapong I ran some tests a couple of years ago and got about one type-1 error per second (roughly) at 45,000 hash probes per second with only 32 bits of hashcode. This works out to 1/45000 or 0.002%. I think this is lower than your 1/4000 number because the positions that are visited in a search are not randomly chosen but are generally closely related to each other. That is, when you do get a hashcode match it's very much more likely that the positions are the same. If the positions were truly chosen at random, then attempted hits on the hash table would almost always fail, and the rare success (hashcodes match) would almost always be a type-1 error. Selecting the random numbers can be fairly tricky. Many C library random number generators have a very short repeat cycle (typically 64k). Many of the usual techniques for fast random number generation give you numbers with a high degree of non-randomness in the low order bits (typically the lowest order bit just toggles on and off as you generate successive "random" numbers). The next lowest order bit has a slightly more complex toggling pattern and so on. If you construct your 64-bit random numbers by joining two 32-bit random numbers in pairs, you can end up effectively losing a bit (or more) because of strong correlations (or anti-correlations) between bits within the numbers. Also, you want the numbers to sufficiently different from each other. That is, you don't particularly want numbers that are expressible as combinations (with XOR) of small numbers of the other random numbers--otherwise it becomes too easy to get type-1 errors. It's also good to have your own random number generator (in C or whatever) so that you can get the same set of RNs when you move from one platform or compiler to another. -Dan.
This page took 0 seconds to execute
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.