Author: Peter Fendrich
Date: 07:59:50 06/04/02
On June 04, 2002 at 06:29:48, Rolf Tueschen wrote:

>On June 04, 2002 at 04:50:45, Peter Fendrich wrote:
>
>>On June 03, 2002 at 22:20:08, Allen Lake wrote:
>>
>>>On June 03, 2002 at 20:39:35, Dann Corbit wrote:
>>>
>>>>There are lies, damn lies, and statistics. The big problem with statistics is
>>>>that 99% of the world's population has no idea what they might possibly mean.
>>>>Therefore, when they see them, they draw all sorts of incorrect assertions from
>>>>them.
>>>
>>>I wondered when somebody was going to say this :) And I applaud your example of
>>>the two lists below:
>>>
>>>>    Program          Elo    +    -  Games   Score  Av.Op.   Draws
>>>>
>>>>  1 LG2000V3      : 2589   97  197     31  77.4 %    2375   6.5 %
>>>>  2 Yace 0.99.50  : 2586   31  104    188  86.2 %    2268  12.8 %
>>>>  3 MAD-005       : 2583   37  110    145  83.1 %    2306  10.3 %
>>>>  4 Crafty-18.10  : 2580  102  138     29  75.9 %    2381  27.6 %
>>>>  5 Comet-B37     : 2554  103  129     31  71.0 %    2398  25.8 %
>>>>  6 TCBishop-4601 : 2542  103  165     30  73.3 %    2367  13.3 %
>>>>  7 Gromit3       : 2527   99  100     36  66.7 %    2407  33.3 %
>>>>  8 Nejmet-260    : 2526  109  163     28  71.4 %    2367  14.3 %
>>>>  9 Phalanx-xxii  : 2522  102  153     33  68.2 %    2390   9.1 %
>>>> 10 AnMon-509     : 2518   33   83    195  81.0 %    2266  14.4 %
>>>> 11 Amy-07        : 2507   36  106    157  82.5 %    2238   9.6 %
>>>> 12 TCBishop-0045 : 2503  112  130     29  67.2 %    2378  24.1 %
>>>> 13 AnMon-510     : 2497   36   98    162  81.5 %    2240  11.1 %
>>>> 14 ZChess-222    : 2492   31   75    230  78.9 %    2263  12.6 %
>>>> 15 GLC-213       : 2469  112  120     33  60.6 %    2395  18.2 %
>>>> 16 ZChess-120    : 2452   35   70    194  74.7 %    2264  14.4 %
>>>> 17 Gromit2       : 2434  125   81     32  53.1 %    2412  43.8 %
>>>> 18 Pepito-121    : 2432  126  117     28  58.9 %    2369  25.0 %
>>>> 19 Ant-606       : 2429  110  110     37  56.8 %    2382  16.2 %
>>>> 20 FranWB-090    : 2427   36   63    202  71.5 %    2267  15.3 %
>>>>
>>>>    Program          Elo    +    -  Games   Score  Av.Op.   Draws
>>>>
>>>>  1 LG2000V3      :  289   97  197     31  77.4 %      75   6.5 %
>>>>  2 Yace 0.99.50  :  286   31  104    188  86.2 %     -32  12.8 %
>>>>  3 MAD-005       :  283   37  110    145  83.1 %       6  10.3 %
>>>>  4 Crafty-18.10  :  280  102  138     29  75.9 %      81  27.6 %
>>>>  5 Comet-B37     :  254  103  129     31  71.0 %      98  25.8 %
>>>>  6 TCBishop-4601 :  242  103  165     30  73.3 %      67  13.3 %
>>>>  7 Gromit3       :  227   99  100     36  66.7 %     107  33.3 %
>>>>  8 Nejmet-260    :  226  109  163     28  71.4 %      67  14.3 %
>>>>  9 Phalanx-xxii  :  222  102  153     33  68.2 %      90   9.1 %
>>>> 10 AnMon-509     :  218   33   83    195  81.0 %     -34  14.4 %
>>>> 11 Amy-07        :  207   36  106    157  82.5 %     -62   9.6 %
>>>> 12 TCBishop-0045 :  203  112  130     29  67.2 %      78  24.1 %
>>>> 13 AnMon-510     :  197   36   98    162  81.5 %     -60  11.1 %
>>>> 14 ZChess-222    :  192   31   75    230  78.9 %     -37  12.6 %
>>>> 15 GLC-213       :  169  112  120     33  60.6 %      95  18.2 %
>>>> 16 ZChess-120    :  152   35   70    194  74.7 %     -36  14.4 %
>>>> 17 Gromit2       :  134  125   81     32  53.1 %     112  43.8 %
>>>> 18 Pepito-121    :  132  126  117     28  58.9 %      69  25.0 %
>>>> 19 Ant-606       :  129  110  110     37  56.8 %      82  16.2 %
>>>> 20 FranWB-090    :  127   36   63    202  71.5 %     -33  15.3 %
>>>>
>>>>Notice (however) that the highest ELO is 2589 in the first list and 289 in the
>>>>second. Yet this is irrelevant. The only thing that matters are the
>>>>differences.
>>>
>>>I know you've tried long and hard to make this point over the months I've been
>>>reading here, Dann. Unfortunately, that second type of list, no matter how
>>>statistically valid, is not going to "sell" many chess programs -- and isn't
>>>that what most of the major rating lists are _really_ about? This is not to say
>>>that SSDF and its list don't provide a valuable consumer resource concerning the
>>>commercial chess programs.
>>
>>I'm not sure what you mean by that but SSDF does compute ratings solely based on
>>the differences between the players. The list is in the next step adjusted based
>>on some games between programs vs humans. The level of the list is intended to
>>be of about the same level as the FIDE list. They will of course not be
>>interchangeable and can never be but it gives some feeling of where the level is
>>compared to humans.
>
>No way, Peter! As you write it was about "some" games. This is _not_
>correct calibrating. At that time you did that programs would have lost to 100%
>against experts, masters, IM and GM. But then afterwards you started the comp vs
>comp fantasy. And today you claim GM level on the basis of "some" games against
>patzers. I don't say that you did intend that at the beginning. But it's the
>description of the historical truth.
>
>Rolf Tueschen

I don't really know which games are used today, but that is not very important
from my point of view. There is no correct calibration, whatever method you
use. They are not the same pool of players, and even if you calibrate, let's
say, Fritz's rating against USCF or FIDE (two different pools that would
indeed give Fritz different ratings), you can't know for sure that the other
ratings in the list are calibrated. They probably are not. You have to live
with the fact that we can only get roughly the same rating levels. Look at the
German ratings compared to FIDE. I haven't seen them, but I would bet that
they are different.

Peter
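
As a rough illustration of the point quoted above (only the differences matter), here is a small Python sketch. It assumes the common logistic Elo expectation formula with a 400-point scale; the program that produced the quoted lists may use a different curve, so the numbers it prints are not meant to reproduce the Score column. The names expected_score, shifted and OFFSET are made up for this example. The three rows are copied from the first quoted list, and 2300 is the constant by which the second list differs from the first.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the logistic Elo model (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Constant difference between the two quoted lists (2589 vs 289, 2375 vs 75, ...).
OFFSET = 2300

# (program, Elo, average opponent) copied from the first quoted list.
rows = [
    ("LG2000V3", 2589, 2375),
    ("Yace 0.99.50", 2586, 2268),
    ("Gromit2", 2434, 2412),
]


def shifted(entries, offset):
    """Return the same entries with every rating lowered by a constant offset."""
    return [(name, elo - offset, opp - offset) for name, elo, opp in entries]


for (name, elo, opp), (_, elo2, opp2) in zip(rows, shifted(rows, OFFSET)):
    # The expectation depends only on elo - opp, so both columns agree.
    e1 = expected_score(elo, opp)
    e2 = expected_score(elo2, opp2)
    print(f"{name:<14} {elo:>5} vs {opp:>5}: {e1:.3f}   {elo2:>5} vs {opp2:>5}: {e2:.3f}")
```

Both columns print the same expectation for each program, because the formula only ever sees the rating difference. That is also why fixing the level of a list against a human pool is a separate step: it chooses the constant, and nothing in the program-vs-program games can determine it.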
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.