Author: Rolf Tueschen
Date: 07:41:39 02/20/03
On February 20, 2003 at 10:32:04, Tony Hedlund wrote:

>On February 18, 2003 at 16:11:43, Rolf Tueschen wrote:
>
>>On February 18, 2003 at 13:20:19, Tony Hedlund wrote:
>>
>>>On February 17, 2003 at 17:56:02, Rolf Tueschen wrote:
>>>
>>>>On February 17, 2003 at 13:36:28, Tony Hedlund wrote:
>>>>
>>>>>On February 17, 2003 at 09:05:31, Rolf Tueschen wrote:
>>>>>
>>>>>>On February 17, 2003 at 06:53:14, Uri Blass wrote:
>>>>>>
>>>>>>>On February 17, 2003 at 06:29:23, Rolf Tueschen wrote:
>>>>>>>
>>>>>>>>On February 16, 2003 at 13:21:39, Tony Hedlund wrote:
>>>>>>>>
>>>>>>>>>On February 15, 2003 at 07:12:10, Rolf Tueschen wrote:
>>>>>>>>>
>>>>>>>>>>On February 15, 2003 at 05:24:43, Tony Hedlund wrote:
>>>>>>>>>>
>>>>>>>>>>>On February 14, 2003 at 16:27:31, Rolf Tueschen wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On February 14, 2003 at 13:32:16, Tony Hedlund wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>On February 14, 2003 at 09:27:26, Rolf Tueschen wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>On February 14, 2003 at 08:43:12, Bob Durrett wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Excellent points. The "bottom line" is that SSDF presented their findings
>>>>>>>>>>>>>>>properly, but the problem is in interpretation. SSDF cannot be held responsible
>>>>>>>>>>>>>>>for errors in interpretation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Bob D.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Wrong conclusion. I tried to explain the points but apparently it's a bit too
>>>>>>>>>>>>>>difficult. In short : If you use a system of statistics you are not allowed to
>>>>>>>>>>>>>>make your own presentation. The presentation by SSDF is FALSE. That is the
>>>>>>>>>>>>>>point. False and unallowed. Instead of 1., 2., 3., they should say 1.-3., not
>>>>>>>>>>>>>>should, but must, if the differences in the actual results are way smaller than
>>>>>>>>>>>>>>the error in the tests itself. Is that impossible to understand?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Rolf Tueschen
>>>>>>>>>>>>>
>>>>>>>>>>>>>Then the right presentation is:
>>>>>>>>>>>>>
>>>>>>>>>>>>>1-10  Shredder 7          2801-2737
>>>>>>>>>>>>>1-10  Deep Fritz 7        2789-2732
>>>>>>>>>>>>>1-11  Fritz 7             2770-2711
>>>>>>>>>>>>>1-2?  Shredder 7 UCI      2761-2638
>>>>>>>>>>>>>1-15  Chess Tiger 15      2753-2700
>>>>>>>>>>>>>1-15  Shredder 6 Pad UCI  2750-2703
>>>>>>>>>>>>>1-16  Shredder 6          2750-2689
>>>>>>>>>>>>>1-19  Chess Tiger 14      2744-2684
>>>>>>>>>>>>>1-19  Deep Fritz          2741-2680
>>>>>>>>>>>>>1-19  Gambit Tiger 2      2739-2681
>>>>>>>>>>>>>3-2?  Junior 7            2715-2659
>>>>>>>>>>>>>4-2?  Hiarcs 8            2707-2657
>>>>>>>>>>>>>
>>>>>>>>>>>>>and so on.
>>>>>>>>>>>>>
>>>>>>>>>>>>>Tony
>>>>>>>>>>>>
>>>>>>>>>>>>Thanks for the fine joke, Tony. Perhaps you lay your finger into the wound!
>>>>>>>>>>>>You want to have a number one, right? Then you make tests, just like you do,
>>>>>>>>>>>>fair and correct. And then you come into the period where you must evaluate your
>>>>>>>>>>>>results. You see that you have no clear number one. Now two possibilities:
>>>>>>>>>>>>
>>>>>>>>>>>>1) You go on into decisive mode and do further tests, the "list" date can wait.
>>>>>>>>>>>>
>>>>>>>>>>>>2) You stay to your traditions and show up with your list. But then, please, do
>>>>>>>>>>>>NOT present the list either in the classical way, nor in your joking Mr. Bean
>>>>>>>>>>>>version, but simply make such packages:
>>>>>>>>>>>>
>>>>>>>>>>>>1.-3. A B C
>>>>>>>>>>>>4.-5. D E
>>>>>>>>>>>>6. F
>>>>>>>>>>>>7.-10. G H I
>>>>>>>>>>>>etc.
>>>>>>>>>>>>
>>>>>>>>>>>>Tell me please, where the problem is with this method?
>>>>>>>>>>>
>>>>>>>>>>>Why just three strongest engines? With the margin of errors Gambit Tiger 2 could
>>>>>>>>>>>be as strong as the other top engines. I find Mr. Bean's version more logical than
>>>>>>>>>>>yours. Could you please explain your method further.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>SSDF has good statistics experts.
Consult these experts and you will understand >>>>>>>>>>why Gambit Tiger 2 could NOT be number one. My first three was a pool where all >>>>>>>>>>could be number one. Only Shredder 7 UCI could be included, but my example was >>>>>>>>>>more a demonstration of such a list. It's not MY method. It's simply what >>>>>>>>>>careful researchers would do if they had your results. Perhaps you don't know >>>>>>>>>>it, Tony, but the presentation of the results must have a base in the results. >>>>>>>>> >>>>>>>>>What do you propose SSDF do exactly? Give me a clear example of how you would >>>>>>>>>present the data. Don't give me this A, B and C. You have the result, wich >>>>>>>>>programs are A, B and C? >>>>>>>>> >>>>>>>>>>In other words it might well be that one day you will have a clear number one. >>>>>>>>> >>>>>>>>>The bottom line is that when we reach a margin of error close to zero, then we >>>>>>>>>can claim a number one? When will that happen? After 10 000 games by each >>>>>>>>>entrance? >>>>>>>>> >>>>>>>>>>Or do you believe that your method guarantees the eternal status quo? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>Is it because you have >>>>>>>>>>>>kind of strong wish to present a umber one by all means? >>>>>>>>>>> >>>>>>>>>>>Do you also think that FIDE shouldn't have a number one on there list? Is >>>>>>>>>>>Kasparov really the best player? >>>>>>>>>> >>>>>>>>>>Please do not seek for outside help, when you run out of arguments in favor of >>>>>>>>>>your own presentation. >>>>>>>>> >>>>>>>>>FIDE, ICCF and SSDF all have a ratinglist. And we all use professor Arpad Elo's >>>>>>>>>metod of measure strenght in chess. And yes I argue for our way of presentation. >>>>>>>>>ICCF's number one Ulf Andersson have played 25 games! Figure the margin of error >>>>>>>>>there. They probably don't have any careful researchers. >>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>Please let's simply >>>>>>>>>>>>discuss this little topic. 
If you tell me, listen, Rolf, I am not allowed to >>>>>>>>>>>>tell you, but you are right, that a umber one prog is very important for us. >>>>>>>>>>> >>>>>>>>>>>It seem to be more important to others. >>>>>>>>>> >>>>>>>>>>Yes, that was my deeper assumption. Could you give more details? >>>>>>>>> >>>>>>>>>Details? >>>>>>>>>People here at CCC seem to be looking forward for our next list, to see wich is >>>>>>>>>number one. And then they congratulate the programmer. And of course the >>>>>>>>>commercials use it in there advertisement. As they always has. When we started >>>>>>>>>our list, it was as a complement to our reviews for new programmes. >>>>>>>>>Personally I'm not interested in wich program is number one. I'm more interested >>>>>>>>>in how the different engines are playing. >>>>>>>> >>>>>>>>I can well imagine your personal sentiments and I have great respect for your >>>>>>>>efforts with SSDF as a whole but you can't stop history's progress. When you >>>>>>>>played move by move with the ancient chessboards your dedication and hard work >>>>>>>>was really sensational and people got results for their virgin background. Today >>>>>>>>- with autoplayed games - you have more time to do sound statistics. However, if >>>>>>>>simply the top programs do not differ that much then you can't call out a number >>>>>>>>one. Or you play millions of games. But who guarantees you that then you will >>>>>>>>have a clear first? No - you should accept the actual reality. And that is >>>>>>>>equality among the top entries. >>>>>>>> >>>>>>>>You are misleaden if you think that the thankfullness of the CC users was linked >>>>>>>>with your presentation of a number one. It was because of your general efforts >>>>>>>>to the best of CC. >>>>>>> And the business world at that time was very coloured. But >>>>>>>>today we have a single important company. Do you want to do your job for them >>>>>>>>and their marketing interests or for the users around the world? 
You must >>>>>>>>accept that if statistically you have no clear first then you can't present a >>>>>>>>number one program. >>>>>>> >>>>>>>Number one only means leading it does not mean best. >>>>>>>I do not see what is your problem with it. >>>>>>> >>>>>>> >>>>>>> What does that bother you??? You are independent! But >>>>>>>>independent does not mean naive.Why don't you consider the consequences of such >>>>>>>>strange events: Fritz8 is out for months and you don't test it. I read that you >>>>>>>>wait until ChessBase will send you a copy. But that then would no longer speak >>>>>>>>for your independent tests. Because factor time of testbeginning always was a >>>>>>>>factor. All such dangers and difficulties you could avoid with sound statistics >>>>>>>>and certain basic guidelines. You must become independent of such marketing >>>>>>>>decisions by ChessBase. >>>>>>> >>>>>>>I do not see what is the problem with waiting for chessbase to send the program. >>>>>>>It is not that they do everything that chessbase tell them and >>>>>>>I believe that if chessbase ask them not to test programs of another company >>>>>>>like Tiger they will not do it. >>>>>>> >>>>>>>I believe that they should test only if programmers ask them otherwise they may >>>>>>>waste time on testing the wrong versions and they will have no computer time >>>>>>>to test the right versions. >>>>>>> >>>>>>>They did not test a lot of programs and Fritz8 is not alone. >>>>>>>They did not test Movei and hundreds of free programs and I see no reason that >>>>>>>testing Fritz8 is more important when the programmer did not ask them to do it. >>>>>>> >>>>>>>Note that I did not ask them to test Movei and I do not complain(Maybe I will >>>>>>>ask them in the future when Movei will be significantly better). 
>>>>>>> >>>>>>>Note also that testing Fritz8 is more important than testing Movei if both >>>>>>>programmers ask them to do it but if chessbase do not ask them to do it then >>>>>>>buying Fritz8 in order to test it may be a waste of time because they will >>>>>>>have no time to test stronger Fritz. >>>>>>> >>>>>>>I think that the customers may also be intereted in the rating of Fritz that >>>>>>>chessbase send them because I believe that the customers will get the same Fritz >>>>>>>as an update and if the ssdf waste time now on testing Fritz8 they will have no >>>>>>>computer time to test the upgrade that chessbase may release. >>>>>>> >>>>>>>Uri >>>>>> >>>>>> >>>>>>You have interesting views on independance. Please come into CTF so that we can >>>>>>talk about Israel. What you say is unacceptable from the point of independant >>>>>>testings. You don't believe it, but then you have no knowledge about the >>>>>>neccessities of statistics. It's not a moral or such, it's a must! Otherwise the >>>>>>results are NOT independant and you can trash SSDF. >>>>> >>>>>What you are saying is, since our number one is a program from Chessbase then we >>>>>can't be independent. If Ruffian was number one this thread wouldn't have >>>>>started, would it? >>>> >>>>No, where did I say such a nonsense? Please learn English before you make such >>>>conclusions. I think I know what you are doing here. Instead of answering >>>>http://www.talkchess.com/forums/1/message.html?284772, what you _couldn't_, you >>>>step in here [what is normally no problem, but here it _is_ a problem!] without >>>>exact understanding for the language of a message and try to stir confusion. The >>>>reason why you do that is clear. You know that you have no justification for >>>>your presentation of a number "one" and you see ccritics, so there is a single >>>>possibility and that is stirring confusion, so that the reader should hear you >>>>saying: "well, you know this is Rolf, what could he have to say? 
We, the SSDF,
>>>>are in the business for decades!" But all such doctoring does NOT change the
>>>>fact that you have no base for the presenting of Shredder 7 as "number one".
>>>
>>>It seems to me that you are running out of arguments, and so the insults start.
>>
>>It's the other way round. I gave my strongest argument that you must create
>>confusion (it's only Rolf, it's only against ChessBase), because you have no
>>base (statistically) for the presentation of a number one. And sure - you still
>>have no arguments. Therefore you now invent a new confusion, namely that I would
>>'insult'. Could you tell me where I insult? Where exactly?
>
>Above you write "Please learn English before you make such conclusions." And in
>the end you write "Again, please try to learn English before you step in other
>people's debates." I may be too sensitive, but accusing me of not understanding
>English is insulting.

Yes, because you take everything at face value. How could I, with my weak English, invite you to learn better English? Hint => Joking. And that is proven. But I leave it here. This message will prove that you are no game at debates. You simply prefer to make _your_ jokes and leave most of the questions aside. This message here does prove that you have no answers for most of my questions. Just take a look for yourself. Well, I will call it arrogance again. Then you will reply "but he's insulting". And exactly that is the defamation. Like Sune Fischer. But answers, you have not.

Rolf Tueschen

>
>>Why should I insult
>>you, you have never done me wrong in the past, unlike your colleagues Bertil
>>and Peter F. No, I declare that I had no reason to insult you and would never do
>>that. For me this is here more a psychological topic. I ask myself why such a
>>decent person like yourself suddenly goes into such a mode of larmoyance.
>>
>>We all here, me included, respect you in SSDF for the huge work you've done over
>>the decades.
When I had the possibility to ask my questions in 1996, I was so >>happy, after so many years I had followed your list. But from the beginning I >>observed incredibly weak reactions. I will never forget the expression for >>critics, namely "member of the Czub Anti-SSDF gang [sic!!]". That is ridiculous >>for me because I had my questions right from my education in university studies >>and mathematics. Suddenly I was accused, defamed to be a member of a gang! That >>was in 1996. >> >>In the meantime I published so many faults in your methodology and always the >>main reply was "we are amateurs, not scientists". >> >>Let me give you the probably most serious argument against your test methods. >>You always argue that FIDE has Elolists, and you want to imply that your list >>would just be the same or at least similar. I object. For very basic reasons. >>Elo for human players has data a) for thousands of players and b) for thousands >>of games for each player. > >I very much doubt that. Where can I find this data? > >>Many players will have a record over the period of 30 >>years and more. > >= One generation. > >>The databases of publicly known games is about 2,5 million >>games. - > >Rated games? I don't think so. > >> >>Now let's take a look what _you_ have. No insult meant, Tony, honestly. >> >>You know like I do, that you have modern "players" [program versions] with a >>life of 12 months > >= One generation > >>on possible different hard-ware. You always claim that you >>have a database of 60000 games. > >A database of 16000 games. But 90000 rated games played. > >>To exploit that pool you always declare that >>therefore a modern program MUST also be paired with a rather antique program. > >You mean that a new entrance must play against an entrance with an established >rating. > >>Then you claim that validity is assured through some 30 games of Swedish players >>20 years ago... 
> >We calibrated our first lists with 337 games played against swedish players >1987-1991. See: http://home.interact.se/~w100107/level.htm > >> >>You know what I know? I can tell you. With such conditions you have no base for >>a reasonable list. You have 5 or 10 programs each season that are comparable. >>YOu have no justification to start the tests always with a number of 1500 or >>such because the new version has ZERO Elo. > >I agree. A new entrance have no Elo. > >>And now you construct with imbreeding >>technology GM results. With Elo all this has nothing to do. > >Don't mix up Fide-elo with SSDF-Elo. > >>You have no history >>in your ranking. What you have is the artificial combining of representatives of >>differet species from different historic pasts. But these "representatives" have >>surprisingly no own history in the developments of hard-ware for instance. But >>you don't remark the basic fault. I explained it may times. If you use different >>hard ware you can't test the strengths of programs. > >How come? Shredder X A1200 is one entrance and Shredder X K6-2 450 is another. >I thought _that_ was trivial. > >>Nobody in SSDF understood >>this although it is a very trivial argument or truth. >> >>NB the difference to human Elo numbers. Look: Smyslov once was a World Champion, >>right? He is still playing today.But his performance is down to 2450 or >>something. But this is because of his age. Let's now take a look into SSDF >>former World leading programs. Excuse me, Tony, I have no data about the early >>results, but back in time MEPHISTO III surely was a good program. Or MChess 1. >>Just take a prog out of that time. Why does such a program no longer play >>today??? Why don't you test MChess 1 on Pentium IV??? That is what you should do >>among other things. But what you do in reality is this: You become not tired to >>test the newest versions of the company's progs. 
You have no interest for the >>where-abouts of your earlier favorits, it's as if you all threw them into the >>bin. And that makes your list so artificial and false! >> >>I know, that you could say that it makes no sense to let Mchess 1 play because >>1) we had MChess7 and 2) there is no sense in letting MChess1 play on P4. >> >>But if that is the case then you should admit that you could NOT compare your >>"list" with the human Elolist. > >Is that your problem!! Then I fully agree. You _can't_ compare the SSDF-Elolist >with the FIDE-Elolist. > >>Tony, I invite you to think about all this - if you have time. Let's discuss >>this in a friendly atmosphere. Perhaps we can find a new base for SSDF. > >Rolf, we already seem to have come to an understanding. > >Tony > >>> >>>>>>You are giving your personal opinions and nobody is allowed to attack you so far >>>>>>but what is if you simply had no idea what is going on here? You have no >>>>>>understanding for the meaning of average terms embedded in daily speech. You say >>>>>>but they only tell us who is leading! That doesn't mean that he's the best. But >>>>>>Uri, that is NOT the point at all. The point is that they cannot conclude that >>>>>>someone is leading with these 8 points and a margin of 30 on both sides. >>>>> >>>>>But we can! >>>> >>>>No, you can't! - Of course you can do what you want. Next time you could present >>>>X as new number one with 1 point advantage and 60 points of margin. >>> >>>Exactly! >> >>Inyour own interest you should reconsider that opinion. >> >> >> >>> >>>>>As you pointed out earlier, and I quot "SSDF has good statistics >>>>>experts". >>>> >>>> >>>>Did I say that? Yes, often I like irony. >>> >>>So now it was irony? >> >>Of course. It was clear because everybody knows my critic of your false >>methodology. It's here in the archives and also on my homepage. 
See: >>http://hometown.aol.de/rolftueschen/rolftueschenmosaik.html >> >> >> >> >>> >>>> >>>> >>>>> >>>>>>You >>>>>>have no idea what that exactly means! >>>>> >>>>>Speak for yourself. >>>> >>>>Sure, that is what I always do! I am famous for it and therefore certain >>>>interested groups don't like me. But what is your business here? Uri and I have >>>>a communication for months now and you seem to feel envy? >>> >>>Running out of arguments? You said to me, and I quot "Please let's simply >>>discuss this little topic." So I was under the impression that this thread was >>>between us. >> >>yes. But here I was addressing Uri as you can see here below. You stepped i here >>but you didn't answer the other message I made. >> >> >> >>> >>>>>>So then you can well talk about "Let them >>>>>>do what they do, they are not doing something wrong"! Uri, they are so wrong, >>>>>>more than your own Prime Minister! Because they do something very special: >>>>>> >>>>>>They say that Shredder7 is the new number one, the new leader as you say. And >>>>>>they give these margins! Together that means: Folks, we have no clear result for >>>>>>place one! And I argue against the mistakes. But here in CCC experts behave as >>>>>>if the margins would make the overall verdict ok, because the experts know what >>>>>>margins mean. I translate: experts are saying that a lie is not a lie as long as >>>>>>the experts have a possibility to see whats really going on. >>>>> >>>>>YOU say it's a lie. That's your opinion, not a fact. >>>> >>>> >>>>Again, please try to learn English before you step in other people's debates. I >>>>did NOT say what you believe here. >>> >>>More insults? Other people's debate? You said, and I quot "But here in CCC >>>experts behave as if the margins would make the overall verdict ok, because the >>>experts know what margins mean. I translate: experts are saying that a lie is >>>not a lie as long as the experts have a possibility to see whats really going >>>on." 
>> >>Yes, and that is the truth.I read more than once that experts here said that >>possible errors in SSDF were of no importance because the experts knew how what >>was meant. Interesting because the list is published in chess journals where >>thousands of users read it, users without expert status. So this is not a honest >>debate. IMO. >> >> >> >> >> >>> >>> >>>>> >>>>>>But the lack of >>>>>>respect for the dumb users is well allowed, because that is business. >>>>> >>>>>We have respect for the users, it's for them we are doing the list. But we have >>>>>no respect for DUMB users. >>>> >>>>Oh well, that will be a candidate for the quote of the year! >>>> >>>> >>>> >>>>> >>>>>>Against >>>>>>that confusion I say, no no, SSDF is responsible because THEY annouced new >>>>>>number 1! >>>>> >>>>>Yes Rolf, SSDF is responsible for having a number 1 in the list. >>>> >>>>Yes, and that is why I criticised the faults of SSDF. Namely presenting a number >>>>one that is not number one. >>> >>>But it is number one, within the margin of errors. >> >> >>No! Within the margins you have no way to know who is first of the three progs. >> >> >> >>> >>>>I think a good analogy is this: you write a message >>>>here with "Tony" and you supply a photo that is showing a man with _green_ hair. >>>>Then in the header line you say "Tony" ("see photo, the man with the red [sic!] >>>>hair"). Then Rolf writes a critic and shows that green hair is not the same as >>>>red hair. Then Tony writes a message "we in SSDF have a long experience and >>>>never before users criticised us for the presentation of wrong-colored hair; >>>>only dumb users like Rolf have a problem with the difference between red and >>>>green hair; in Sweden the two colors are the _same_!!! We in SSDF also have many >>>>good color experts." >>>> >>>>:) >>> >>>Thanks for the fine joke, Rolf. >> >> >>Do you take jokes as personal insults? Please let's not go into that mode. I >>have great respect for you. 
And that does not change if you support errors in >>the SSDF list. I think we can discuss this and hope that it could be changed. As >>long as you don't call me names or make open insults, I try too give friendly >>opinions. >> >>Rolf Tueschen >> >> >> >> >>> >>>Tony >>> >>>>Rolf Tueschen >>>> >>>>> >>>>>Tony >>>>> >>>>>>Rolf Tueschen
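
For readers following the numbers: the rank ranges in Tony's list quoted above ("1-10", "1-11", "3-2?") appear to follow a simple overlap rule on the rating intervals. The sketch below is only an illustration of that rule, not SSDF's actual procedure. The function name rank_ranges and the data layout are assumptions made for this example, and the intervals are just the twelve entries quoted in the thread. Since the quoted list breaks off with "and so on", the worst ranks for the lower entries are capped at 12 here and will not reproduce Tony's 15, 16 or 19.

```python
# Rating intervals (lower, upper) as quoted in the thread; the list is truncated there.
ENGINES = [
    ("Shredder 7", 2737, 2801),
    ("Deep Fritz 7", 2732, 2789),
    ("Fritz 7", 2711, 2770),
    ("Shredder 7 UCI", 2638, 2761),
    ("Chess Tiger 15", 2700, 2753),
    ("Shredder 6 Pad UCI", 2703, 2750),
    ("Shredder 6", 2689, 2750),
    ("Chess Tiger 14", 2684, 2744),
    ("Deep Fritz", 2680, 2741),
    ("Gambit Tiger 2", 2681, 2739),
    ("Junior 7", 2659, 2715),
    ("Hiarcs 8", 2657, 2707),
]


def rank_ranges(entries):
    """Return {name: (best_rank, worst_rank)} from overlapping intervals.

    Hypothetical helper: an engine is pushed down only by engines whose
    whole interval lies above its own, and can fall below every engine
    whose interval reaches at least as high as its own lower bound.
    """
    ranges = {}
    for name, low, high in entries:
        # Engines that are certainly stronger: their lower bound beats our upper bound.
        best = 1 + sum(1 for _, other_low, _ in entries if other_low > high)
        # Engines that could be at least as strong (this count includes the engine itself).
        worst = sum(1 for _, _, other_high in entries if other_high >= low)
        ranges[name] = (best, worst)
    return ranges


if __name__ == "__main__":
    for name, (best, worst) in rank_ranges(ENGINES).items():
        print(f"{best}-{worst}  {name}")
```

On the quoted data this gives 1-10 for Shredder 7 and Deep Fritz 7, 1-11 for Fritz 7, and best ranks of 3 and 4 for Junior 7 and Hiarcs 8, which matches the top of Tony's list. Whether such ranges or grouped places like "1.-3." are the better presentation is exactly what the thread is arguing about.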
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.