Author: Jan-Frode Myklebust
Date: 11:42:25 02/28/98
On February 28, 1998 at 14:14:48, Robert Hyatt wrote:

>On February 28, 1998 at 11:17:43, Jan-Frode Myklebust wrote:
>
>>On February 28, 1998 at 10:55:31, Robert Hyatt wrote:
>>[snip]
>>>
>>>But in any case, it would be nice to see only 200mmx vs 200mmx or whatever,
>>>and stop this unequal platform competition, since it doesn't provide any
>>>useful information, basically...
>>
>>Doesn't it? If the SSDF results are supposed to give some sort of rating
>>depending on the strength of the programs/computer/whatever, I don't see
>>why testing 200mmx vs. 90 doesn't give any useful information. It's
>>basically the same as having a stronger player play a weak one. The
>>stronger side has more to lose, and little to gain.
>>
>>It's probably like being in a world of IMs, and suddenly a bunch of
>>GMs turn up beating the crap out of the IMs. Should we then stop
>>testing IMs against GMs? :)
>>
>>Another thing might be that the ratings of the P90s and the P200mmx's
>>are too close, but that will even out as more games are played.
>>
>>(Remember: the SSDF results have little to do with real life human chess)
>>
>>janfrode
>
>Here's my problem:
>
>Exactly how much does a 3X speed improvement increase a program's
>rating?

I have no idea. But if SSDF continues testing on both P90 and P200MMX, we'll soon find out.

>I know of *no* way to accurately assess that, since it seems to be a value
>that is different from program to program. So when someone posts a match
>result of 24 wins and 6 losses, for a 4:1 win/loss ratio, what does that
>mean? Over 200 rating points? Yes. But then you notice that the winner
>has a 3:1 speed advantage. So what do you conclude?

I conclude that 'program on hardware' is over 200 rating points stronger than 'other program on other hardware', which is what was measured. The SSDF isn't only there to test software. Remember, they also test handheld computers against these 200MMXs.

>
>Over a bunch of games, with a bunch of opponents, on a bunch of different
>platforms, it is not hard to statistically evaluate the results. But for
>a single match with a single opponent with a constant hardware advantage,
>it is impossible for *me* to conclude whether the faster program is better
>or worse than the handicapped program.

But if you go into more detail in the results, you can see each entry's results against the other entries.

>
>That was my *only* point. Not that the SSDF is providing any bad data,
>or anything else. Only that for a single match with time odds, I don't
>know how to figure out what part of that lopsided victory was due to a
>better program and what part was due to the hardware advantage.
>
>If I had seen 24:6 with a 3:1 hardware edge, 16:14 on equal hardware, I
>could figure that out. But in a single match, there are two degrees of
>freedom, the program's skill and the hardware advantage. Either could
>account for all, part or none of the results...

I guess the SSDF isn't testing only "the program's skill", but rather "the program's skill on this or that hardware".

janfrode
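
For anyone who wants to check the arithmetic behind "over 200 rating points", here is a rough sketch in Python. It assumes the standard logistic Elo model and counts only decisive games (no draws); the function name elo_diff is made up for this illustration. The 16:14 equal-hardware score is the hypothetical from the quoted post, not a real result, and subtracting the two gaps assumes program strength and hardware speed simply add up in rating terms, which is exactly the part nobody knows for sure.

```python
import math


def elo_diff(wins: int, losses: int) -> float:
    """Rating gap implied by a decisive-game score under the logistic Elo model.

    Assumption: draws are ignored, so the expected score is wins / (wins + losses).
    """
    if wins <= 0 or losses <= 0:
        raise ValueError("need at least one win and one loss for a finite estimate")
    # Elo: expected score p = 1 / (1 + 10 ** (-d / 400)), so d = 400 * log10(p / (1 - p)),
    # and p / (1 - p) is simply wins / losses.
    return 400.0 * math.log10(wins / losses)


# 24:6 with a 3:1 speed advantage (the match result quoted in the post)
unequal = elo_diff(24, 6)

# 16:14 on equal hardware (hypothetical figure from the quoted post)
equal = elo_diff(16, 14)

# Assumption: the two effects are additive in rating points.
hardware_share = unequal - equal

print(f"24:6 on unequal hardware: {unequal:.1f} points")
print(f"16:14 on equal hardware:  {equal:.1f} points")
print(f"left over for hardware:   {hardware_share:.1f} points")
```

With only 30 games per match the error bars on each of these numbers are wide, so treat the split as a back-of-the-envelope figure at best.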
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.