Author: Rolf Tueschen
Date: 06:32:38 02/20/03
On February 20, 2003 at 09:12:18, Tony Hedlund wrote:

>On February 18, 2003 at 16:22:58, Rolf Tueschen wrote:
>
>>On February 18, 2003 at 12:53:52, Tony Hedlund wrote:
>>
>>>On February 17, 2003 at 06:29:23, Rolf Tueschen wrote:
>>>
>>>>On February 16, 2003 at 13:21:39, Tony Hedlund wrote:
>>>>
>>>>>On February 15, 2003 at 07:12:10, Rolf Tueschen wrote:
>>>>>
>>>>>>On February 15, 2003 at 05:24:43, Tony Hedlund wrote:
>>>>>>
>>>>>>>On February 14, 2003 at 16:27:31, Rolf Tueschen wrote:
>>>>>>>
>>>>>>>>On February 14, 2003 at 13:32:16, Tony Hedlund wrote:
>>>>>>>>
>>>>>>>>>On February 14, 2003 at 09:27:26, Rolf Tueschen wrote:
>>>>>>>>>
>>>>>>>>>>On February 14, 2003 at 08:43:12, Bob Durrett wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Excellent points. The "bottom line" is that SSDF presented their findings
>>>>>>>>>>>properly, but the problem is in interpretation. SSDF cannot be held responsible
>>>>>>>>>>>for errors in interpretation.
>>>>>>>>>>>
>>>>>>>>>>>Bob D.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Wrong conclusion. I tried to explain the points but apparently it's a bit too
>>>>>>>>>>difficult. In short: if you use a system of statistics you are not allowed to
>>>>>>>>>>make your own presentation. The presentation by SSDF is FALSE. That is the
>>>>>>>>>>point. False and not allowed. Instead of 1., 2., 3., they should say 1.-3., not
>>>>>>>>>>should, but must, if the differences in the actual results are way smaller than
>>>>>>>>>>the error in the tests themselves. Is that impossible to understand?
>>>>>>>>>>
>>>>>>>>>>Rolf Tueschen
>>>>>>>>>
>>>>>>>>>Then the right presentation is:
>>>>>>>>>
>>>>>>>>>1-10 Shredder 7 2801-2737
>>>>>>>>>1-10 Deep Fritz 7 2789-2732
>>>>>>>>>1-11 Fritz 7 2770-2711
>>>>>>>>>1-2? Shredder 7 UCI 2761-2638
>>>>>>>>>1-15 Chess Tiger 15 2753-2700
>>>>>>>>>1-15 Shredder 6 Pad UCI 2750-2703
>>>>>>>>>1-16 Shredder 6 2750-2689
>>>>>>>>>1-19 Chess Tiger 14 2744-2684
>>>>>>>>>1-19 Deep Fritz 2741-2680
>>>>>>>>>1-19 Gambit Tiger 2 2739-2681
>>>>>>>>>3-2? Junior 7 2715-2659
>>>>>>>>>4-2? Hiarcs 8 2707-2657
>>>>>>>>>
>>>>>>>>>and so on.
>>>>>>>>>
>>>>>>>>>Tony
>>>>>>>>
>>>>>>>>Thanks for the fine joke, Tony. Perhaps you lay your finger into the wound!
>>>>>>>>You want to have a number one, right? Then you make tests, just like you do,
>>>>>>>>fair and correct. And then you come into the period where you must evaluate your
>>>>>>>>results. You see that you have no clear number one. Now two possibilities:
>>>>>>>>
>>>>>>>>1) You go on into decisive mode and do further tests, the "list" date can wait.
>>>>>>>>
>>>>>>>>2) You stay with your traditions and show up with your list. But then, please, do
>>>>>>>>NOT present the list either in the classical way or in your joking Mr. Bean
>>>>>>>>version, but simply make such packages:
>>>>>>>>
>>>>>>>>1.-3. A B C
>>>>>>>>4.-5. D E
>>>>>>>>6. F
>>>>>>>>7.-10. G H I
>>>>>>>>etc.
>>>>>>>>
>>>>>>>>Tell me please, where the problem is with this method?
>>>>>>>
>>>>>>>Why just three strongest engines? With the margin of errors Gambit Tiger 2 could
>>>>>>>be as strong as the other top engines. I find Mr. Bean's version more logical than
>>>>>>>yours. Could you please explain your method further?
>>>>>>
>>>>>>
>>>>>>SSDF has good statistics experts. Consult these experts and you will understand
>>>>>>why Gambit Tiger 2 could NOT be number one. My first three was a pool where all
>>>>>>could be number one. Only Shredder 7 UCI could be included, but my example was
>>>>>>more a demonstration of such a list. It's not MY method. It's simply what
>>>>>>careful researchers would do if they had your results. Perhaps you don't know
>>>>>>it, Tony, but the presentation of the results must have a base in the results.
>>>>>
>>>>>What do you propose SSDF do exactly? Give me a clear example of how you would
>>>>>present the data. Don't give me this A, B and C. You have the result, which
>>>>>programs are A, B and C?
>>>>>
>>>>>>In other words it might well be that one day you will have a clear number one.
>>>>>
>>>>>The bottom line is that when we reach a margin of error close to zero, then we
>>>>>can claim a number one? When will that happen? After 10 000 games by each
>>>>>entry?
>>>>>
>>>>>>Or do you believe that your method guarantees the eternal status quo?
>>>>>>
>>>>>>>
>>>>>>>>Is it because you have a
>>>>>>>>kind of strong wish to present a number one by all means?
>>>>>>>
>>>>>>>Do you also think that FIDE shouldn't have a number one on their list? Is
>>>>>>>Kasparov really the best player?
>>>>>>
>>>>>>Please do not seek outside help when you run out of arguments in favor of
>>>>>>your own presentation.
>>>>>
>>>>>FIDE, ICCF and SSDF all have a rating list. And we all use Professor Arpad Elo's
>>>>>method of measuring strength in chess. And yes, I argue for our way of presentation.
>>>>>ICCF's number one Ulf Andersson has played 25 games! Figure the margin of error
>>>>>there. They probably don't have any careful researchers.
>>>>>
>>>>>>>
>>>>>>>>Please let's simply
>>>>>>>>discuss this little topic. If you tell me, listen, Rolf, I am not allowed to
>>>>>>>>tell you, but you are right, that a number one prog is very important for us.
>>>>>>>
>>>>>>>It seems to be more important to others.
>>>>>>
>>>>>>Yes, that was my deeper assumption. Could you give more details?
>>>>>
>>>>>Details?
>>>>>People here at CCC seem to be looking forward to our next list, to see which is
>>>>>number one. And then they congratulate the programmer. And of course the
>>>>>commercials use it in their advertising. As they always have. When we started
>>>>>our list, it was as a complement to our reviews of new programmes.
>>>>>Personally I'm not interested in which program is number one. I'm more interested
>>>>>in how the different engines are playing.
>>>>
>>>>I can well imagine your personal sentiments and I have great respect for your
>>>>efforts with SSDF as a whole but you can't stop history's progress. When you
>>>>played move by move with the ancient chessboards your dedication and hard work
>>>>was really sensational and people got results for their virgin background. Today
>>>>- with autoplayed games - you have more time to do sound statistics. However, if
>>>>the top programs simply do not differ that much then you can't call out a number
>>>>one. Or you play millions of games. But who guarantees you that then you will
>>>>have a clear first? No - you should accept the actual reality. And that is
>>>>equality among the top entries.
>>>
>>>That's why we have the margins of error. So the intelligent users can make that
>>>interpretation.
>>>
>>>>You are misled if you think that the thankfulness of the CC users was linked
>>>>with your presentation of a number one. It was because of your general efforts
>>>>to the best of CC. And the business world at that time was very coloured. But
>>>>today we have a single important company. Do you want to do your job for them
>>>>and their marketing interests or for the users around the world? You must
>>>>accept that if statistically you have no clear first then you can't present a
>>>>number one program. Why does that bother you??? You are independent! But
>>>>independent does not mean naive. Why don't you consider the consequences of such
>>>>strange events: Fritz8 has been out for months and you don't test it. I read that you
>>>>wait until ChessBase will send you a copy. But that then would no longer speak
>>>>for your independent tests.
>>>
>>>We also wait for a new version of Yace and some copies of CM9000.
>>>
>>>>Because the time when testing begins always was a
>>>>factor. All such dangers and difficulties you could avoid with sound statistics
>>>
>>>We already have sound statistics. It's your OPINION that we don't.
>>>
>>>>and certain basic guidelines. You must become independent of such marketing
>>>>decisions by ChessBase.
>>>
>>>Yes, we depend on getting free copies of programs since we don't have the funds
>>>to buy copies for all our testers.
>>
>>Since we have a very open and friendly debate, please could you answer two
>>points?
>>
>>1) Tell me what you think about the message by Mogens Larsen! Please.
>
>Could you be more specific?

http://www.talkchess.com/forums/1/message.html?284841

Rolf Tueschen

>
>>2) Let's break a taboo, Tony. Tell me how many testers you have. I have serious
>>information that it's not higher than 5. Is this correct?
>
>No.
>
>>Let's face reality.
>>When I take your published games then I detect only three authors.
>
>You should detect four.
>
>>So what does
>>it mean when you talk about "testers".
>
>8-10.
>
>Tony
>
>>Rolf Tueschen
>>
>>>
>>>>Don't ask me for the details. I am not a member and I was defamed long enough
>>>>by your colleagues on the staff.
>>>>
>>>>Rolf Tueschen
>>>>
>>>>>
>>>>>>Rolf Tueschen
>>>>>>>
>>>>>>>>Then, Tony, I am out of the debate, because I had great respect for your amateur
>>>>>>>>approach. Comps are not cheap either. etc. To make it clear. I would not oppose
>>>>>>>>sponsoring. But if you said, but Rolf, look, we have a real number one! That is
>>>>>>>>the exact result of our statistics. - Then however, I will continue to ask
>>>>>>>>polite questions.
>>>>>>>
>>>>>
>>>>>The exact result of our statistics is the way Mr. Bean interprets the list.
>>>>>You chose not to comment on this, why?
>>>>>
>>>>>Tony
>>>>>
>>>>>>>
>>>>>>>>Rolf Tueschen
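
The rank ranges in the list Tony quotes above ("1-10 Shredder 7 2801-2737" and so on) can be reproduced from the rating intervals alone. Below is a minimal sketch in Python, assuming the two numbers after each name are the upper and lower bounds of the rating interval, and using only the twelve entries quoted in the thread. The function name rank_range is made up for illustration. Because the quoted list is cut off with "and so on", the worst-case ranks for the lower entries come out smaller here than in the thread.

```python
# Rating intervals as (upper, lower), copied from the twelve entries quoted
# in the thread. ASSUMPTION: the two numbers are the interval bounds.
INTERVALS = {
    "Shredder 7": (2801, 2737),
    "Deep Fritz 7": (2789, 2732),
    "Fritz 7": (2770, 2711),
    "Shredder 7 UCI": (2761, 2638),
    "Chess Tiger 15": (2753, 2700),
    "Shredder 6 Pad UCI": (2750, 2703),
    "Shredder 6": (2750, 2689),
    "Chess Tiger 14": (2744, 2684),
    "Deep Fritz": (2741, 2680),
    "Gambit Tiger 2": (2739, 2681),
    "Junior 7": (2715, 2659),
    "Hiarcs 8": (2707, 2657),
}


def rank_range(name, intervals):
    """Return (best, worst) possible rank of `name` given overlapping intervals.

    Hypothetical helper, not SSDF's own procedure.
    """
    upper, lower = intervals[name]
    others = [bounds for other, bounds in intervals.items() if other != name]
    # Best case: only engines whose whole interval lies above ours
    # are certain to rank ahead.
    best = 1 + sum(1 for (u, l) in others if l > upper)
    # Worst case: every engine whose interval reaches above our lower
    # bound could rank ahead.
    worst = 1 + sum(1 for (u, l) in others if u > lower)
    return best, worst


if __name__ == "__main__":
    for engine in INTERVALS:
        best, worst = rank_range(engine, INTERVALS)
        print(f"{best}-{worst} {engine}")
```

With these twelve entries the sketch gives 1-10 for Shredder 7 and Deep Fritz 7, 1-11 for Fritz 7, and a best possible rank of 3 for Junior 7 and 4 for Hiarcs 8, which matches the quoted list.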
Last modified: Thu, 15 Apr 21 08:11:13 -0700