Author: Rolf Tueschen
Date: 04:53:24 06/04/02
On June 04, 2002 at 07:11:21, Sune Fischer wrote:

>On June 04, 2002 at 06:22:14, Rolf Tueschen wrote:
>
>>>No. I mean that the program is not going to play the same way each time. There
>>>will be patterns and deficits and all sorts of interesting quirks with its play.
>>
>>Of course so! But because of that you would talk about indeterminism? You
>>shouldn't.
>
>Yes, this is what we mean by indeterminism.
>Take the Zobrist table, if you do not initialize that with a random seed but use
>the clock generated one, then you will generate different keys at each search
>and so the hashtable saves different results and that will affect the search.
>Not _very_ random I guess, but not completely *deterministic* either.

My knowledge about machines is not sufficient to debate that, but I think that this is strange. Why should a machine play at random if it once "thought" it had found the best play? Would you say that even if the perfect way has been found, some random changes could still have even better results? I don't get this.

>
>>>GM's performances against computers recently have been rather disappointing.
>>>Quite frankly, I am not sure who is exposing the weaknesses of the other more.
>>
>>Here my thought experiment could give you the right clue/idea.
>
>Yes but we do not have the money to buy enough GM playing time needed for the
>10000 games, so you can forget that idea I'm afraid.

Excuse me, that was someone else's fantasy. I don't think that we need 10 000 games. What we need are opponents who don't play human chess against machines. That's all.

>
>>>>But what is excellent in your eyes? And where is a program 100 points higher
>>>>than another in SSDF, I fear I do not get it.
>>>>The 64% is _not_ saying anything. And you can never say +100 in SSDF. Guess why?
>>>>(Calibration!)
>>>
>>>The SSDF data is carefully calibrated against itself. There is no other meaning
>>>to calibration in the ELO system.
>>
>>That was your best for the past weeks I think! *g*
>>
>
>Ok, if this is too complicated we can simplify it.
>Say you take two football teams, France and Brazil and let them play each other
>100 times. Say the score is 45-55, that gives us a measure of how much stronger
>Brazil is than France (if at all). This is "selfcalibration" if you want to call
>it that. It tells you nothing about how France would score against Germany, but
>for an internal reference in the pool it is very good.
>However if Germany played Brazil, the theory goes that we could use that result
>to predict the outcome of France - Germany.
>
>We do not have GMs in the pool of computers, so the SSDF can only be used as an
>internal reference, no one disputes that I think.

We have a different understanding of calibrating then. Oh my God! This is simply nonsense, excuse me so much. By definition you can't calibrate France against Brazil and then claim that you are now thinking that Brazil is as good as the gorillas in the jungle. Know what I mean?

As to your predictions, I agree that we could always be sure about the result in matches between different "classes", say 2002 giants against 1999 winners. But for 2002 giants against each other we do not know, because of the ridiculously high margins. And then a little reflection. Our predictions could also come out of the games themselves, but not from the ranking list. That is my point. Of course only if someone knows about computerchess and the games.

>
>>> You simply do not understand how it works.
>>>
>>
>>I can't help myself but I must agree, I never before heard of a calibrating
>>against itself.
>
>see above :)
>
>>
>>>>>The SSDF results do not predict how the machines will do against
>>>>>people.
>>>>
>>>>And why? *g*
>>>>Because the results are not valid, they have no meaning.
>>>
>>>If you think that the results are invalid, then you don't understand what the
>>>SSDF is or what they are trying to do.
>>>The math is (without question) far
>>>better than most attempts at this sort of thing and the data is much more robust
>>>than (for instance) that of FIDE. The error bars are dependable and the results
>>>will be repeatable. I think perhaps your difficulty with the data is that they
>>>do not represent what you would like them to mean. That is not an error in the
>>>experiment but (rather) a design choice. Of course, you can always make your
>>>own experiment and gather the dozens of volunteers needed to complete the
>>>experiments.
>>
>>Dann, this is leading us to our final judgement day. It is here, just now. I
>>don't dream of my own test you bet. The SSDF has a wonderful team at hand. Why
>>should I intervene? The only thing I want is, that they understand that the
>>actual results are bogus. And when I say bogus I mean absolute bogus, not just a
>>slight touch of bogus. Because we have no validity. Means, that the 2600 "Elo"
>>is a complete fata morgana, and I take for granted that you know the scale of
>>human Elo. For reasons unknown to me SSDF seems to make their list similar to
>>the human list. Oh yes! But the validation had been lost in the meantime. So we
>>have the numbers, but no meaning. In short that is bogus.
>
>Like Dann said, it is not bogus but you need to understand what it does show.
>It cannot be compared to the FIDE scale, and nobody is saying it can.
>It may be close, we suspect that I guess, but we do not really know.

Would you tell me why the numbers look so suspiciously close to some human rankings? I mean if we take the performances of machines in show events. Hint, we're not talking about the surface of SSDF but about the "real" strength of machines and humans.

>
>>But what does Elo 2650 at SSDF mean??? Just this little question to begin with. Do
>>you have any idea?
>
>Well what does 2345 FIDE mean? - it is a number on a scale that is used to
>compare humans.

I agree. And you would conclude that the SSDF numbers must be something similar?
Why? Where was this calibrated and where was it made valid?

Rolf Tueschen

>The SSDF is a scale to compare computers, nothing more.
>
>-S.
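
For readers following the numbers in this exchange (the 64%, the +100, the 45-55 football score): under the usual logistic Elo formula a rating gap maps to an expected score and back again. Below is a minimal Python sketch, assuming that textbook formula with the conventional 400-point scale. The thread does not describe the SSDF's exact procedure, and the function names are illustrative only.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Expected score (between 0 and 1) for the side rated `rating_diff` points higher.

    Assumes the textbook logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def rating_diff_from_score(score: float) -> float:
    """Inverse mapping: rating gap implied by a match score fraction.

    The score must lie strictly between 0 and 1; a 100% or 0% result
    implies an unbounded gap under this model.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # A 100-point gap corresponds to roughly a 64% expected score.
    print(expected_score(100.0))
    # A 55-45 result over 100 games implies a gap of roughly 35 points.
    print(rating_diff_from_score(0.55))
```

Note that both functions only ever see a difference between two ratings from the same pool. Nothing in the formula fixes the absolute level of the scale, which is the point under dispute above: whether a number like 2650 on one list can be read against a number on another list depends on games played between the two pools, not on the arithmetic.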
Last modified: Thu, 15 Apr 21 08:11:13 -0700