Author: blass uri
Date: 10:45:40 05/31/99
On May 31, 1999 at 11:44:15, Melvin S. Schwartz wrote:

>On May 30, 1999 at 16:03:40, Daniel Karlsson wrote:
>
>>On May 30, 1999 at 11:24:22, Melvin S. Schwartz wrote:
>>
>>[Snip]
>>>
>>>The amount of points you speak of is of a hypothetical nature. If you like
>>>comparing software with A at B speed and C at D speed, then we just simply
>>>disagree.
>>>
>>>Regards,
>>>Mel
>>
>>Suppose program A on hardware B gets a 70% score against C on D. Now if E on B
>>gets a 75% score against C on D, wouldn't that be a good indication that E on B
>>is stronger than A on B, i.e. E is stronger than A on the same hardware?
>>
>>Match AB and EB against several opponents, calculate ratings from the scores,
>>and you get a pretty good rating list. This is basically what SSDF are doing.
>
>You may get a pretty good idea of a rating, but how accurate is it? I suspect we
>are dealing with a strong assumption here. It may be the only way that SSDF can
>do it, but Shep's site is where you'll find tournaments where programs ARE
>competing against each other on EQUAL hardware. So I am compelled to regard
>Shep's results as more authentic than SSDF's method.

The reason to trust the SSDF results more is that they are based on many more games than Shep's: every top program has more than 100 games, usually several hundred.

The main problem I have with the SSDF results is that most of the games are not public, so I cannot check those results for mistakes. In the past I found mistakes in one match, Junior5 (P200) vs. Rebel8 (P90), where Junior5 was slowed down by a significant factor in 4 games because the tester ran another application at the same time. The tester repeated those games. It is impossible to discover such mistakes when the games are not public.
>Another problem I have with SSDF is that their opponents for Chessmaster 6000
>have an average rating, if memory serves me well, more than 100 points below
>that of Hiarcs 7 or Fritz 5.32, just to name a few. In my opinion, and this is
>just my opinion, I believe they should confine their testing to the top
>programs, because that may allow them to use the same hardware for all. Also,
>since we know, for example, that Hiarcs 7 is better than Hiarcs 6, why do they
>continue to test Hiarcs 6? This applies to other outdated programs as well. Do
>you see industry extensively testing newer and better cars, computers, TVs,
>etc., when it has already been established that new products from the same
>manufacturer are superior to their older models?

Usually new programs are better than old programs, but it is not obvious: there is at least one case where the SSDF results do not show improvement, and Mchess8's rating is worse than Mchess7's.

Uri
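[Editorial illustration] The score-to-rating inference Daniel Karlsson describes above can be sketched with the standard Elo model: a score percentage against a common opponent implies a rating difference, so two programs can be compared indirectly even if they never play each other. The example numbers (70% and 75%) come from the thread; the function name and the Elo formula choice are mine, a minimal sketch rather than SSDF's actual procedure.

```python
import math

def elo_diff(score: float) -> float:
    """Rating advantage implied by an expected score s, 0 < s < 1,
    under the Elo model: diff = -400 * log10(1/s - 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Program A on hardware B scores 70% against C on hardware D:
diff_a = elo_diff(0.70)   # roughly +147 Elo over C
# Program E on the same hardware B scores 75% against the same C on D:
diff_e = elo_diff(0.75)   # roughly +191 Elo over C

# Since both differences are measured against the same opponent on the
# same hardware, E looks about diff_e - diff_a (~44) points stronger than A.
print(round(diff_a), round(diff_e), round(diff_e - diff_a))
```

This indirect comparison is exactly why sample size matters so much in Uri's argument: with only a few games, a 70% vs. 75% gap is well within noise.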
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.