Author: Dann Corbit
Date: 11:31:50 09/28/99
On September 28, 1999 at 13:35:10, Ratko V Tomic wrote:

>> While it may seem an arbitrary measure, they use 100 games as a cutoff because they believe that results based on fewer games are simply unreliable.
>
> Well, yes, the fewer games, the greater the uncertainty. My point was: how do you distribute the uncertainty fairly or optimally? Do you take one company and reduce its uncertainty (by sticking with N>100) and simultaneously greatly widen the uncertainties for the other companies/authors (since they get N=0 games on the 450 MHz machine, so it takes extrapolation from the 200 MHz results to approximate the correct program rank)? Of course not. The overall uncertainty (the sum of squares of the individual uncertainties) is minimized by making all the individual uncertainties equal to each other. So mathematically their "optimization" of the uncertainties is ridiculous.
>
> And as an "accidental" side effect of that (at best mathematically inept) decision, it just happened that this one company will get the top 4 spots on the list, given to it BEFORE the first game ever started. SSDF defends that by saying that, with all the caveats and footnotes, they're not really claiming that the top 4 on the list are the 4 best ones. What a joke. If an organizer of a car race arranged the conditions of the race so that the GM cars get the top 4 spots up front, BEFORE the race even started, and then published the race "results," would you buy his "explanation" that it was a fair race, since if you read all the footnotes to the chart, and apply some statistical formulas, fuel chemistry and a bit of aerodynamics..., you will realize that the top 4 on the list are not really the 4 best cars?
>
> Even the most benevolent/naive interpretation of their decision could only be ineptitude in judging the uncertainties and mindless disregard of its side effects.

If you really did understand the math (which you obviously *do not*), then you would *know* that the top entries are mathematical peers. They post the standard deviation as well, and all of the top entries are within one standard deviation of each other.

>> Accusing them of selling out to corporate interests, without better than (IMO flimsy) circumstantial evidence, is not going to convince me that it is true.
>
> I don't think the evidence of _purposeful_ unfairness is flimsy (only the exact mechanism behind it may be unknown). The SSDF decision is such that even CB's marketing department couldn't have picked a better one. Namely, according to SSDF itself (see their web page), CB proposed the very same "uneven hardware" test, with only the CB program running on the fast hardware, in return for providing that hardware free of charge. So CB was effectively offering a bribe to obtain an unfair edge over the competition. SSDF, on their web page, proudly explains how they refused this "help." So SSDF acknowledges that it was perfectly aware (of the obvious fact) that running other companies' products on slower hardware gives an unfair edge to CB. Yet in this cycle they did exactly that. Why?

I suspect that they will also run the other products on the fast hardware. The CB deal would have forbidden them from doing so.

> At best (the most benevolent view) one could say that through their ineptitude and naivete they allowed themselves to be slyly manipulated by CB (the SSDF site shows some evidence of this manipulation, where CB somehow manages to make SSDF use the proprietary CB autoplayer, in the face of protests from other companies/authors). Since SSDF folks read this board, they're welcome to explain it their way.

What is the alternative to the CB autoplayer?
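To put the statistical core of the dispute in concrete terms: a rating estimated from n games has a standard error proportional to 1/sqrt(n), so for a fixed total budget of games the sum of the squared uncertainties is smallest when the games are split evenly, which is Tomic's optimization point. A minimal Python sketch, where the 200 Elo per-game spread and the game counts are illustrative assumptions, not SSDF's actual figures:

  # Sum of squared rating uncertainties for a fixed budget of games.
  # Per-program standard error is modeled as SIGMA_GAME / sqrt(n);
  # SIGMA_GAME = 200 Elo is an illustrative assumption, not SSDF's model.

  SIGMA_GAME = 200.0  # assumed per-game result spread, in Elo points

  def total_variance(games_per_program):
      """Sum of squared standard errors across all programs."""
      return sum(SIGMA_GAME ** 2 / n for n in games_per_program)

  # 400 total games split over 4 programs, two different ways.
  even = [100, 100, 100, 100]   # equal uncertainty for everyone
  skewed = [250, 50, 50, 50]    # one program's uncertainty favored

  print(total_variance(even))    # 1600.0 (20 Elo std error each)
  print(total_variance(skewed))  # 2560.0 (worse overall uncertainty)

Since sum(1/n_i) under a fixed sum(n_i) is minimized when all the n_i are equal, any allocation that favors one program raises the total uncertainty, whatever the actual per-game spread is.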
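Corbit's "mathematical peers" reply can be phrased in the same terms: two independent rating estimates are indistinguishable when the gap between them is small compared with the standard deviation of their difference, sqrt(s1^2 + s2^2). A hedged sketch with made-up ratings and deviations, not actual SSDF list values:

  # Two independent rating estimates r1 +/- s1 and r2 +/- s2 differ
  # by a quantity whose standard deviation is sqrt(s1**2 + s2**2).
  # Ratings and deviations below are invented for illustration.
  import math

  def are_peers(r1, s1, r2, s2, z=1.0):
      """True if the rating gap is within z combined std deviations."""
      return abs(r1 - r2) < z * math.hypot(s1, s2)

  # A 10-point gap against ~39 points of combined noise: peers.
  print(are_peers(2550, 25, 2540, 30))   # True
  # A 150-point gap swamps the noise: genuinely different.
  print(are_peers(2550, 25, 2400, 30))   # False

With z=1.0 this mirrors the "within one standard deviation" criterion in the post above; a stricter significance test would use a z of about 2.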