Author: Bertil Eklund
Date: 01:25:51 08/27/02
On August 26, 2002 at 18:13:56, Rolf Tueschen wrote:

>On August 26, 2002 at 07:54:49, Uri Blass wrote:
>
>>On August 26, 2002 at 07:21:58, Rolf Tueschen wrote:
>>
>>>On August 26, 2002 at 06:45:03, Uri Blass wrote:
>>>
>>>>On August 26, 2002 at 06:18:11, Rolf Tueschen wrote:
>>>>
>>>>>On August 25, 2002 at 12:00:06, Peter Fendrich wrote:
>>>>>
>>>>>>On August 25, 2002 at 07:59:54, Kurt Utzinger wrote:
>>>>>>
>>>>>>>Please have a look at
>>>>>>>
>>>>>>>http://ccc.it.ro/search/ccc.php?art_id=217174
>>>>>>>
>>>>>>>Regards
>>>>>>>Kurt
>>>>>>
>>>>>
>>>>>=============================================================================
>>>>>>These tables are not accurate at all for the lines covering only few games.
>>>>>=============================================================================
>>>>>
>>>>>So, the tables are not correct (for the cases when you only have few, very few
>>>>>games!), "because" the tables require normal distribution. So far so good. Now
>>>>>you are argueing, let's take binominal or trinominal, and then we could get rid
>>>>>of the limitations when you have very few cases (like in SSDF)? I hope I had no
>>>>>language interferences?
>>>>
>>>>ssdf usually play hundreds of games with every program so I do not see the only
>>>>few games problem.
>>>
>>>Excuse me, but I see it. How many hundreds of games they play, that could be
>>>added up?
>>
>>Here is the list of the programs above 2600.
>>You can see that the porgrams played usually more than 400 games
>
>Yes, Uri, I knew it. But! I wrote "that could be added up". Didn't you know that
>the newest progs also play neandertal (M. Scheidl)? How could they even suggest
>that we accept such a nonsense. The term validity comes into play. I repeat the
>question. What do they measure?? Learning function? Or book? This is so trivial
>for someone who knows what is important in statistics. But this is all a
>repetition. I think in May 2002 I had already explained all that.
>Only - you will notice that the SSDF team doesn't answer the exact questions. Not that it
>matters because I'm talking about undeniable facts. This is not my personal
>liking or my idea or wishful thinking.
>
>
>>
>> #  Program and hardware                    Rating    +    -  Games  Score  Av.opp
>> 1  Fritz 7.0 256MB Athlon 1200 MHz           2741   30  -29    574    64%    2636
>> 2  Shredder 6.0 Paderb 256MB Athlon 1200     2727   34  -32    467    65%    2619
>> 3  Chess Tiger 14.0 CB 256MB Athlon 1200     2721   33  -32    487    63%    2627
>> 4  Gambit Tiger 2.0 256MB Athlon 1200        2718   31  -30    523    60%    2645
>> 5  Shredder 6.0 256MB Athlon 1200 MHz        2717   32  -31    505    64%    2618
>> 6  Deep Fritz 256MB Athlon 1200 MHz          2716   33  -32    491    63%    2622
>> 7  Junior 7.0 256MB Athlon 1200 MHz          2689   29  -29    593    58%    2632
>> 8  Rebel Century 4.0 256MB Athlon 1200 MHz   2684   33  -32    475    63%    2586
>> 9  Hiarcs 8.0 256MB Athlon 1200 MHz          2671   28  -28    624    55%    2638
>>10  Shredder 5.32 256MB Athlon 1200 MHz       2669   30  -30    538    57%    2622
>>11  Gandalf 4.32h 256MB Athlon 1200 MHz       2652   34  -33    430    54%    2624
>>12  Deep Fritz 128MB K6-2 450 MHz             2651   23  -23    959    61%    2571
>>13  Gandalf 5.0 256MB Athlon 1200 MHz         2642   49  -50    202    46%    2674
>>14  Gambit Tiger 2.0 128MB K6-2 450 MHz       2641   29  -28    634    66%    2525
>>15  Gandalf 5.1 256MB Athlon 1200 MHz         2638   26  -26    707    55%    2601
>>16  Junior 7.0 128MB K6-2 450 MHz             2632   25  -25    815    65%    2524
>>17  Chess Tiger 14.0 CB 128MB K6-2 450 MHz    2629   28  -27    667    62%    2543
>>18  Shredder 6.0 UCI 128MB K6-2 450 MHz       2627   55  -54    168    57%    2581
>>19  Fritz 7.0 128MB K6-2 450 MHz              2625   41  -41    294    53%    2604
>>20  Fritz 6.0 128MB K6-2 450 MHz              2619   21  -21   1110    61%    2541
>>21  Crafty 18.12/CB 256MB Athlon 1200 MHz     2613   30  -29    561    53%    2593
>>22  Shredder 5.32 128MB K6-2 450 MHz          2605   28  -27    639    58%    2547
>>
>>>
>>>
>>>>
>>>>>
>>>>>Without agitation let me make this very clear. Any attempt to show something
>>>>>reasonable out of only very few cases (like in SSDF) is a myst. The limitations
>>>>>out of very few cases is absolutely given. There is no way or "trick" to heal
>>>>>that.
>>>>>
>>>>>There is only one single remedy and that is the higher number of cases.
>>>>>And therefore the actual practice of SSDF is meaningless. And no adding would help
>>>>>you out of this mess since you are presenting over 30000 games but these games
>>>>>come from totally incomparable entities. But you could have known this before.
>>>>>The adding of games in human chess is a completely different process.
>>>>>
>>>>>BTW let me repeat the question where you take the validity from in SSDF. What do
>>>>>you measure? And how did you find control mechanisms?
>>>>>
>>>>>Also interesting could be where the similarities in Swedish ELO and human chess
>>>>>ELO are coming from? Is this decided by definition? When was it done?
>>>>>
>>>>>Rolf Tueschen
>>>>
>>>>The list is calculated also based on games of humans against old computers.
>>>
>>>Tournament games? Do you know details about the very few games then? I think we
>>>are talking about a myst, excuse me. Most if not all games were published in PLY.
>>
>>You can download 14738 of their games in
>>http://home.interact.se/~w100107/welcome.htm but unfortunately I do not find
>>comp-human games there.
>>
>>
>>You can find list of human-calibaeration results from 1987-1991 when 24 old
>>programs played against humans and got rating based on average number of game
>>that is slightly more than 10 games for program but unfortunately there are
>>only results and no games when chris carason games do not include the games that
>>they talk about.
>
>That isn't even the most interesting thing. I take it for granted that they
>played these games. But. You can't take some 20 masters from Sweden and let them
>play a few skittles. This is not calibration. It's a joke. Do you think that
>"masters" had something to fear from commercial progs? I don't think so.
>Had they knowledge of the progs? Training? Interest at all? Incentive? Money?
>Where are the data from these events. The evidence. It doesn't work like that!
>You can't take some old master who has still 2450 in the lists and then put him
>in front of a program. And then you take the results as a proof for the strength

The games are from the Swedish Championships and other tournaments in the respective classes, and the results were included in the tournament. I can assure you that most players focused most of all on the game against the computer. Isn't it a bit strange that you are almost always wrong about facts?! How can you expect us to believe you in your other "subjective" accusations, questions, hocus-pocus or just nonsense?

>of the machine. Uri - I know that you are participating in Israel's
>championships and therefore you know that this is not realistical what happens
>in such skittles. Where nothing is at stake. The computer side takes masters to
>get Elo numbers! It isn't kosher to say the least. It is surely _not_
>calibrating. The absence of the game scores is absolutely uninteresting when we
>are talking about skittles. And here the argument that SSDF is _not_ about
>science, but it's a private hobby, is _not_ acceptable. You see how you
>understood calibration! But without calibration and validity you have nothing
>but results and performances. But not Elo numbers comparable to human chess.
>
>BTW all this is _not_ a question of intelligence. Even the most intelligent
>people could be cheated with statistics. Because if you once rely only on your
>natural human estimation you must forcably miss the statistical tricks. With
>stats you can prove that toothbrushs cause the birth of babies. And with SSDF I
>can prove that FRITZ has 3000 ELO. :)
>
>>
>>see http://home.interact.se/~w100107/level.htm for list of the programs that
>>played against humans and their rating based on the games.
>
>Skittles. Shows. Fun.
>
>
>
>>>
>>>
>>>>
>>>>The rating of the good programs in the list were too high so they decided 1 or 2
>>>>years ago to reduce the rating of all programs by 100 elo to make the rating of
>>>>the programs in the top of the list more realistic against humans.
>>>
>>>And now the height is ok? How did you prove it?
>>
>>I did not prove it but I think that most people agree that 2841 for Fritz7 on
>>A1200 is at least 100 elo too high so reducing the number by 100 elo reduce the
>>difference relative to humans.
>
>Uri, Uri! I drives tears in my eyes to see you argue so carefully. But you are
>already intoxicated. Please subtract 300 Elo numbers and then we can start the
>debate. Just my opinion. Other numbers are completely unrealistic. Or did you
>ever see events over a longer period of time, at tournament level, and with real
>money at stake? And most of all, did you see fair rules? The rules are still
>coming from the old days when progs were no real opponents. And also this. You
>know exactly that progs at one time are very good and in others they are weak
>like beginners. I don't mean blunders, I mean misunderstanding very basic chess
>concepts. It's still a mess.
>
>Please please do not even think for a second that I have a lack of respect for
>you. Would I write such articles if I had?
>
>Rolf Tueschen
>

Bertil
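
For anyone who wants to check how the +/- columns in the quoted list depend on the number of games, here is a rough Python sketch. It assumes the standard logistic Elo curve and treats the games as independent with binomial variance (draws are ignored, so the margins are only an approximation). This is not a description of how the SSDF actually computes its list, and the function names are made up for illustration.

```python
import math


def elo_diff(score):
    """Rating difference implied by a score fraction on the logistic Elo curve."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def performance_rating(score, avg_opponent):
    """Average opponent rating plus the difference implied by the score."""
    return avg_opponent + elo_diff(score)


def rating_margins(score, games, z=1.96):
    """Approximate (plus, minus) margins in Elo points.

    Assumption: independent games and binomial variance p*(1-p)/n.
    Draws are not modelled, so this is a sketch, not an exact interval.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    upper = min(score + z * se, 1.0 - 1e-9)  # keep inside (0, 1)
    lower = max(score - z * se, 1e-9)
    centre = elo_diff(score)
    return elo_diff(upper) - centre, centre - elo_diff(lower)


if __name__ == "__main__":
    # (name, games, score fraction, average opponent) copied from the quoted list
    rows = [
        ("Fritz 7.0 Athlon 1200", 574, 0.64, 2636),
        ("Shredder 6.0 UCI K6-2 450", 168, 0.57, 2581),
        ("Fritz 6.0 K6-2 450", 1110, 0.61, 2541),
    ]
    for name, games, score, opp in rows:
        plus, minus = rating_margins(score, games)
        rating = performance_rating(score, opp)
        print(f"{name}: {rating:.0f} +{plus:.0f} -{minus:.0f} ({games} games)")
```

Because the listed score percentages are rounded, the results can differ from the list by a few points. The margin shrinks roughly with the square root of the number of games, which is why the entries with fewer than 300 games carry the widest intervals.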
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.