Author: Sune Fischer
Date: 04:11:21 06/04/02
On June 04, 2002 at 06:22:14, Rolf Tueschen wrote:

>>No. I mean that the program is not going to play the same way each time. There
>>will be patterns and deficits and all sorts of interesting quirks with its play.
>
>Of course so! But because of that you would talk about indeterminism? You
>shouldn't.

Yes, this is what we mean by indeterminism.
Take the Zobrist table: if you do not initialize it with a fixed seed but use a clock-generated one, then you will generate different keys each time, so the hash table saves different results and that will affect the search.
Not _very_ random I guess, but not completely *deterministic* either.

>>GM's performances against computers recently have been rather disappointing.
>>Quite frankly, I am not sure who is exposing the weaknesses of the other more.
>
>Here my thought experiment could give you the right clue/ idea.

Yes, but we do not have the money to buy the GM playing time needed for the 10000 games, so you can forget that idea, I'm afraid.

>>>But what is excellent in your eyes? And where is a program 100 points higher
>>>than another in SSDF, I fear I do not get it.
>>>The 64% is _not_ saying anything. And you can never say +100 in SSDF. Guess why?
>>>(Calibration!)
>>
>>The SSDF data is carefully calibrated against itself. There is no other meaning
>>to calibration in the ELO system.
>
>That was your best for the past weeks I think! *g*
>

Ok, if this is too complicated we can simplify it.
Say you take two football teams, France and Brazil, and let them play each other 100 times. Say the score is 45-55; that gives us a measure of how much stronger Brazil is than France (if at all). This is "self-calibration" if you want to call it that.
It tells you nothing about how France would score against Germany, but as an internal reference within the pool it is very good. However, if Germany played Brazil, the theory goes that we could use that result to predict the outcome of France - Germany.
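
To put rough numbers on the football example, here is a small sketch. It assumes the standard logistic Elo curve (400-point scale) and that draws count as half a point; neither is spelled out above, and the SSDF's exact formulas may differ. The 40-60 Germany - Brazil result is purely hypothetical, made up only to show how a common opponent lets the pool predict a pairing that was never played.

```python
import math

def rating_gap(score_fraction):
    """Rating difference implied by a score fraction (logistic Elo model, assumed)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def expected_score(gap):
    """Expected score for the side that is `gap` points stronger (negative = weaker)."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

# From the example: Brazil scores 55 out of 100 against France.
brazil_over_france = rating_gap(0.55)      # about 35 points

# Hypothetical result, not from the post: Brazil scores 60 out of 100 against Germany.
brazil_over_germany = rating_gap(0.60)     # about 70 points

# Both gaps are measured against the same opponent, so they can be subtracted.
france_over_germany = brazil_over_germany - brazil_over_france

# Predicted score for France in a France - Germany match that was never played.
france_vs_germany = expected_score(france_over_germany)

print(f"Brazil over France:  {brazil_over_france:.1f} points")
print(f"Brazil over Germany: {brazil_over_germany:.1f} points")
print(f"Predicted France score vs Germany: {france_vs_germany:.3f}")
```

All of these numbers are differences inside the pool. Nothing here fixes where the zero of the scale sits, which is why the absolute values only mean something relative to the other members of the same pool.
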
We do not have GMs in the pool of computers, so the SSDF can only be used as an internal reference; no one disputes that, I think.

>> You simply do not understand how it works.
>>
>
>I can't help myself but I must agree, I never before heard of a calibrating
>against itself.

see above :)

>
>>>>The SSDF results do not predict how the machines will do against
>>>>people.
>>>
>>>And why? *g*
>>>Because the results are not valid, they have no meaning.
>>
>>If you think that the results are invalid, then you don't understand what the
>>SSDF is or what they are trying to do. The math is (without question) far
>>better than most attempts at this sort of thing and the data is much more robust
>>than (for instance) that of FIDE. The error bars are dependable and the results
>>will be repeatable. I think perhaps your difficulty with the data is that they
>>do not represent what you would like them to mean. That is not an error in the
>>experiment but (rather) a design choice. Of course, you can always make your
>>own experiment and gather the dozens of volunteers needed to complete the
>>experiments.
>
>Dann, this is leading us to our final judgement day. It is here, just now. I
>don't dream of my own test you bet. The SSDF has a wonderful team at hand. Why
>should I intervene? The only thing I want is, that they understand that the
>actual results are bogus. And when I say bogus I mean absolute bogus, not just a
>slight touch of bogus. Because we have no validity. Means, that the 2600 "Elo"
>is a complete fata morgana, and I take for granted that you know the scale of
>human Elo. For reasons unknown to me SSDF seems to make their list similar to
>the human list. Oh yes! But the validation had been lost in the meantime. So we
>have the numbers, but no meaning. In short that is bogus.

Like Dann said, it is not bogus, but you need to understand what it does show. It cannot be compared to the FIDE scale, and nobody is saying it can.
It may be close, we suspect that I guess, but we do not really know.

>But what means Elo 2650 at SSDF??? Just this little question to begin with. Do
>you have any idea?

Well, what does 2345 FIDE mean? It is a number on a scale that is used to compare humans. The SSDF is a scale to compare computers, nothing more.

-S.
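
P.S. On the Zobrist point near the top of this post, a minimal sketch of the seeding effect. This is not taken from any particular engine: the table layout (12 piece types by 64 squares), the function names, the toy position and the table size are all assumptions chosen for illustration. It only shows that a fixed seed reproduces the same keys on every run, while a clock-derived seed gives different keys, so the same position lands in a different hash slot from one run to the next.

```python
import random
import time

PIECES = 12    # assumed layout: 6 piece types x 2 colours
SQUARES = 64

def make_zobrist_table(seed):
    """Build a table of 64-bit keys from a dedicated generator with the given seed."""
    rng = random.Random(seed)
    return [[rng.getrandbits(64) for _ in range(SQUARES)] for _ in range(PIECES)]

def position_key(table, placement):
    """XOR together the keys of all (piece, square) pairs in the position."""
    key = 0
    for piece, square in placement:
        key ^= table[piece][square]
    return key

# Hypothetical toy position: a few (piece index, square index) pairs.
placement = [(0, 12), (6, 52), (5, 4), (11, 60)]

# Fixed seed: every run of the program gets identical keys.
fixed_a = make_zobrist_table(2002)
fixed_b = make_zobrist_table(2002)

# Clock-derived seed: the keys change from run to run.
clock_table = make_zobrist_table(time.time_ns())

TABLE_SIZE = 1 << 20   # assumed number of hash table entries
slot_fixed = position_key(fixed_a, placement) % TABLE_SIZE
slot_clock = position_key(clock_table, placement) % TABLE_SIZE

print("fixed seed reproducible:", fixed_a == fixed_b)
print("slot with fixed seed:", slot_fixed)
print("slot with clock seed:", slot_clock)
```

With different keys, different positions collide and overwrite each other in the hash table, so what survives in the table varies between runs, and that can change move ordering and cutoffs in the search.
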
Last modified: Thu, 15 Apr 21 08:11:13 -0700