Author: Rolf Tueschen
Date: 17:26:06 05/31/02
On May 31, 2002 at 18:03:47, Bertil Eklund wrote:

>Answer to Rolf Tueschen
>Here is my (slightly forced) answer.

Thank you for your answers. I ask your consent that I won't answer the many ad hominems from your side, so that from here on we can have a fair dispute, IMO.

>># 1) SSDF equips the new programs each time with the fastest hardware. Do we
>>find out this way if the new engine is stronger than the older one? No! Quite
>>simply because the old engines could be as strong or stronger on new hardware.
>
>Usually the "best" engines are played on both new and old hardware.

That is not the exactness I would like to hear from the SSDF, the only serious test ranking list in the world. What is "usually"? Who defines when it should be done?

>># 2) What is a match between a (new) program and an old program good for, when the
>>old one is weaker in all 4 factors from above? How could we find out which factor
>>in the new program is responsible for the difference in strength? We couldn't know!
>
>If you and other reactionary people had been in charge we would still be using
>extremely limited books and programs without new learning. We should also wait a
>year or so until enough "new" programs are out to compete on the new hardware.
>Do you also think Kasparov shouldn't play against an opponent 100 Elo weaker
>than himself? Do you have an idea of how the Elo system works? Did you know that
>you can calculate the ratings both when you play against an opponent 30 Elo
>above your rating and 150 Elo below your rating?

Obviously not. Just to show you that I'm still reading with care, I might add that I always thought Elo could only be calculated by playing as many games as possible against oneself!
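To make the arithmetic behind Bertil's question concrete, the standard logistic Elo formulas look roughly like this (a minimal sketch in Python; the function names and the K-factor of 10 are illustrative choices, not anything the SSDF has published):

def expected_score(own_rating: float, opp_rating: float) -> float:
    # Expected score under the logistic Elo model.
    return 1.0 / (1.0 + 10 ** ((opp_rating - own_rating) / 400.0))

def updated_rating(own_rating: float, opp_rating: float, score: float, k: float = 10.0) -> float:
    # New rating after one game: score is 1 for a win, 0.5 for a draw, 0 for a loss.
    return own_rating + k * (score - expected_score(own_rating, opp_rating))

# The bookkeeping accepts any rating gap. A win against an opponent 30 points
# above you simply earns more than a win against one 150 points below you:
print(round(updated_rating(2500, 2530, 1.0), 1))   # about 2505.4
print(round(updated_rating(2500, 2350, 1.0), 1))   # about 2503.0

In that narrow sense the formula works for any pairing; my objection is not to the bookkeeping but to what the resulting numbers are supposed to measure.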
>># 3) If as a result one program is 8 "Elo points" stronger, how could we know
>>that this is not caused by the different opponents? We couldn't know.
>
>Now we can't, but it is much more exact, in general, than the rating of a human
>who maybe plays 40 games a year, and in the same town against the same opponents
>several times.

I have not the slightest idea of your concept of exactness in the SSDF. What do you mean exactly?

>># 4) How could we know if the result, with a difference of 8 points, won't
>>exactly turn around the rank of each pair of programs after some further 20
>>games each? We couldn't know that.
>
>Now we can't. So what?! Try to compare with the human Elo list. The only thing
>we know is that the human list is much more uncertain.

Is that the new Magna Carta of the SSDF? You don't even have the necessary big pool the human lists do! So much for "certainty". The only thing that is really certain in the SSDF is the deadline; that is what I've learned thanks to your answers.

>># 5) SSDF is not suppressing games of a match, but it is moving a match with
>>only 5 games into the calculation of the Elo numbers and continuing the rest
>>of the match for the next publication. How could we know that this effect does
>>not influence the result of the actual edition? We couldn't know!
>
>Of course it influences the results in some way or another. Did you know that
>there are deadlines for the human list too?

Of course, Bertil! But I have never heard of performances being sent by express to the Elo people, uhmm, while the tournament is still going on. Excuse the act of cruelty, please.

>># 6) SSDF often matches the newest programs against ancient programs. Why? Because
>>the variability of the choice of opponent is important for the calculation of Elo
>>numbers? Hence Kasparov is playing against a master of about Elo 2350? Of course
>>not! Such nonsense is not part of human chess [as a necessity of Elo numbers!].
>>Or is it that the lacking validity of the computer should be replaced by play
>>against the weakest and most helpless opponents? We don't know.
>
>All new programs play against a pool of one or two dozen programs, could be
>more than Kasparov! All programs play against their predecessor (if any). Are you
>sure that it is better to play against an opponent 150 Elo weaker than you than
>against an equal opponent? Do you understand the Elo system?

No, I like equal opponents. But I have difficulties following the SSDF concept of strength because I haven't seen it yet. You must be aware of the fact that you are testing without validity. So, my question: what is strength for you? BTW, you didn't mention the autoplayer, e.g. from ChessBase. Could I be confident that you did some research on the tool?

>># 7) Why is the SSDF presenting a difference in ranks of 8 points, as in May 2002,
>>or earlier even of 1 point, if the margin of error is +/- 30 points and more? Is it
>>possible to detect a difference between two such programs at all? No! SSDF is
>>presenting differences which possibly do not exist in reality, because they can't
>>be established on account of the uncertainty or unreliability of the measurement
>>itself. So, could we believe the SSDF ranking list? No. [Not in its presented
>>form.]
>
>So? If the difference between program A and B (in the above example) is less
>than 60 Elo the result shouldn't be presented?

Until this conversation with me about my propositions, yes, of course.
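To show what a margin of error of +/- 30 points, or a mere 40 games, actually buys, here is a rough back-of-the-envelope calculation in Python (a standard binomial approximation; the function and the example numbers are illustrative, not the SSDF's published method):

import math

def elo_margin(games: int, mean_score: float = 0.5, draw_rate: float = 0.0) -> float:
    # Rough 95% error margin, in Elo points, of a rating difference estimated
    # from `games` games with the given mean score and draw rate.
    # Per-game variance of the score; a draw counts 0.5, so draws shrink the variance.
    variance = mean_score * (1.0 - mean_score) - draw_rate / 4.0
    sd_score = math.sqrt(variance / games)
    # Convert the score uncertainty into Elo via the slope of the logistic curve.
    elo_per_score_point = 400.0 / (math.log(10) * mean_score * (1.0 - mean_score))
    return 1.96 * sd_score * elo_per_score_point

print(round(elo_margin(40)))                  # about 108 Elo: a 40-game match says very little
print(round(elo_margin(40, draw_rate=0.4)))   # about 83 Elo even with 40% draws
print(round(elo_margin(800)))                 # about 24 Elo: hundreds of games for +/- 30

On that arithmetic an 8-point gap between two programs, each carrying a +/- 30 margin, is pure noise; the ordering could easily flip with the next batch of games, which is exactly the point of questions 4 and 7.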
>># 8) SSDF is publishing only results, implying in short commentaries what should
>>be tested next, but details about the test design remain unknown. What are the
>>conditions of the tests? We don't know.
>
>You know that we answer all such questions personally, here, or in other forums.

Excuse me, no, I didn't know that. Is it really a secret?

>># 9) How many testers does the SSDF actually have? 10 or 20? No. I have confidential
>>information that perhaps a handful of testers are doing the main job. Where are
>>all the amateur testers in Sweden? We don't know.
>
>What's the problem if it is 5, 10 or 15 testers? Is it better if it is 20 or
>maybe 24?

Not necessarily, I was just talking about the old days of their testing...

>>This list of questions could be continued if necessary.
>
>>So, what is the meaning of the SSDF ranking list? Perhaps mere PR, because the
>>winning program or the trio of winners could increase its sales figures.
>>Perhaps the programmers themselves are interested in the list. We don't know.
>
>The only meaning is one that you can't understand: the pure love of and interest
>in computer chess. Can you maybe remember the time when the only buying advice
>was the advertisements from, for example, Fidelity, or extremely blind persons like
>a few in this forum? Or the many renowned persons here who believe that the
>best program wins the "computer chess" world championship (the same persons who
>also claim that they understand statistics)?

I have great respect for you and the SSDF. BTW, that was the topic in part one of Mosaiksteinchen 8. I repeat, I have great respect for the tradition of the SSDF. Unfortunately the statistics suck because there's no validity. And 40 games make no sense either. But for the old days of hand-played testing I have great respect! So, if I had to decide, I would start a reformation (!) of the SSDF this instant.

>>[Actually this ranking is unable to answer our questions about strength.]
>
>>[You could read my whole article (number 8) in German at
>>http://members.aol.com/mclanecxantia/myhomepage/rolfsmosaik.html]
>
>I will hopefully try it, but for personal reasons I am very busy at the moment.
>
>Bertil

Let's keep in touch.

GENS UNA SUMUS

Rolf Tueschen