Author: Uri Blass
Date: 04:54:49 08/26/02
On August 26, 2002 at 07:21:58, Rolf Tueschen wrote:

>On August 26, 2002 at 06:45:03, Uri Blass wrote:
>
>>On August 26, 2002 at 06:18:11, Rolf Tueschen wrote:
>>
>>>On August 25, 2002 at 12:00:06, Peter Fendrich wrote:
>>>
>>>>On August 25, 2002 at 07:59:54, Kurt Utzinger wrote:
>>>>
>>>>>Please have a look at
>>>>>
>>>>>http://ccc.it.ro/search/ccc.php?art_id=217174
>>>>>
>>>>>Regards
>>>>>Kurt
>>>>
>>>
>>>=============================================================================
>>>>These tables are not accurate at all for the lines covering only few games.
>>>=============================================================================
>>>
>>>So, the tables are not correct (for the cases when you only have few, very few
>>>games!), "because" the tables require normal distribution. So far so good. Now
>>>you are argueing, let's take binominal or trinominal, and then we could get rid
>>>of the limitations when you have very few cases (like in SSDF)? I hope I had no
>>>language interferences?
>>
>>ssdf usually play hundreds of games with every program so I do not see the only
>>few games problem.
>
>Excuse me, but I see it. How many hundreds of games they play, that could be
>added up?

Here is the list of the programs above 2600.
You can see that the programs usually played more than 400 games:

 #  Program and hardware                     Rating    +    -  Games  Won  Oppo
 1  Fritz 7.0 256MB Athlon 1200 MHz            2741   30  -29    574  64%  2636
 2  Shredder 6.0 Paderb 256MB Athlon 1200      2727   34  -32    467  65%  2619
 3  Chess Tiger 14.0 CB 256MB Athlon 1200      2721   33  -32    487  63%  2627
 4  Gambit Tiger 2.0 256MB Athlon 1200         2718   31  -30    523  60%  2645
 5  Shredder 6.0 256MB Athlon 1200 MHz         2717   32  -31    505  64%  2618
 6  Deep Fritz 256MB Athlon 1200 MHz           2716   33  -32    491  63%  2622
 7  Junior 7.0 256MB Athlon 1200 MHz           2689   29  -29    593  58%  2632
 8  Rebel Century 4.0 256MB Athlon 1200 MHz    2684   33  -32    475  63%  2586
 9  Hiarcs 8.0 256MB Athlon 1200 MHz           2671   28  -28    624  55%  2638
10  Shredder 5.32 256MB Athlon 1200 MHz        2669   30  -30    538  57%  2622
11  Gandalf 4.32h 256MB Athlon 1200 MHz        2652   34  -33    430  54%  2624
12  Deep Fritz 128MB K6-2 450 MHz              2651   23  -23    959  61%  2571
13  Gandalf 5.0 256MB Athlon 1200 MHz          2642   49  -50    202  46%  2674
14  Gambit Tiger 2.0 128MB K6-2 450 MHz        2641   29  -28    634  66%  2525
15  Gandalf 5.1 256MB Athlon 1200 MHz          2638   26  -26    707  55%  2601
16  Junior 7.0 128MB K6-2 450 MHz              2632   25  -25    815  65%  2524
17  Chess Tiger 14.0 CB 128MB K6-2 450 MHz     2629   28  -27    667  62%  2543
18  Shredder 6.0 UCI 128MB K6-2 450 MHz        2627   55  -54    168  57%  2581
19  Fritz 7.0 128MB K6-2 450 MHz               2625   41  -41    294  53%  2604
20  Fritz 6.0 128MB K6-2 450 MHz               2619   21  -21   1110  61%  2541
21  Crafty 18.12/CB 256MB Athlon 1200 MHz      2613   30  -29    561  53%  2593
22  Shredder 5.32 128MB K6-2 450 MHz           2605   28  -27    639  58%  2547

>>>Without agitation let me make this very clear. Any attempt to show something
>>>reasonable out of only very few cases (like in SSDF) is a myst. The limitations
>>>out of very few cases is absolutely given. There is no way or "trick" to heal
>>>that.
>>>
>>>There is only one single remedy and that is the higher number of cases. And
>>>therefore the actual practice of SSDF is meaningless. And no adding would help
>>>you out of this mess since you are presenting over 30000 games but these games
>>>come from totally incomparable entities.
>>>But you could have known this before.
>>>The adding of games in human chess is a completely different process.
>>>
>>>BTW let me repeat the question where you take the validity from in SSDF. What do
>>>you measure? And how did you find control mechanisms?
>>>
>>>Also interesting could be where the similarities in Swedish ELO and human chess
>>>ELO are coming from? Is this decided by definition? When was it done?
>>>
>>>Rolf Tueschen
>>
>>The list is calculated also based on games of humans against old computers.
>
>Tournament games? Do you know details about the very few games then? I think we
>are talking about a myst, excuse me.

You can download 14738 of their games from
http://home.interact.se/~w100107/welcome.htm
but unfortunately I do not find comp-human games there.

You can find a list of human-calibration results from 1987-1991, when 24 old programs played against humans and got ratings based on an average of slightly more than 10 games per program. Unfortunately there are only results and no games, and the chris carason games do not include the games that they talk about.

See http://home.interact.se/~w100107/level.htm for the list of the programs that played against humans and their ratings based on those games.

>>The rating of the good programs in the list were too high so they decided 1 or 2
>>years ago to reduce the rating of all programs by 100 elo to make the rating of
>>the programs in the top of the list more realistic against humans.
>
>And now the height is ok? How did you prove it?

I did not prove it, but I think that most people agree that 2841 for Fritz7 on A1200 is at least 100 elo too high, so reducing the number by 100 elo reduces the difference relative to humans.

Uri
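
A side note on the error margins in the list above: the thread does not say how SSDF computes them, so the following is only a rough sketch under my own assumptions. It treats every game as an independent win/loss trial (draws are ignored), uses the normal approximation at 95% (z = 1.96), and converts score to Elo with the usual logistic formula. The function names `elo_diff` and `elo_margin` are made up for this illustration. With these assumptions, 574 games at 64% gives about +30 / -29, and 168 games at 57% gives a margin of more than 50 elo in each direction, which illustrates how the margin grows as the number of games drops.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score (0 < score < 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_margin(games: int, score: float, z: float = 1.96) -> tuple:
    """Return (plus, minus) Elo margins for a score achieved over `games` games.

    Assumption (not taken from SSDF): each game is an independent win/loss
    trial, so the standard error of the score is sqrt(p * (1 - p) / n).
    Draws, which would reduce the variance, are ignored.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    # Clamp the interval so the logarithm stays defined for tiny samples.
    hi = min(score + z * se, 1.0 - 1e-9)
    lo = max(score - z * se, 1e-9)
    centre = elo_diff(score)
    return elo_diff(hi) - centre, elo_diff(lo) - centre


if __name__ == "__main__":
    # Games and score percentages copied from two rows of the list above.
    rows = [
        ("Fritz 7.0 256MB Athlon 1200 MHz", 574, 0.64),
        ("Shredder 6.0 UCI 128MB K6-2 450 MHz", 168, 0.57),
    ]
    for name, games, score in rows:
        plus, minus = elo_margin(games, score)
        print(f"{name}: +{plus:.0f} / {minus:.0f}")
```

The score percentages in the list are rounded, so the output can differ from the listed margins by a point or two even if the underlying method were the same.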
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.