Author: Tina Long
Date: 20:33:07 12/10/99
On December 10, 1999 at 11:25:58, Charles Unruh wrote:

>I wonder: is it possible that once we get 20 games of GMs against Rebel at 40/2
>(there also seems to be a good number of master and GM games vs Junior), we can
>then recalculate/readjust the ratings of the SSDF by first calculating the
>ratings of other comps vs Rebel (with the rating calculated from its 20 games)?
>In fact, we should certainly be able to find a few masters who are willing to
>play Rebel or any comp at 40/2 over a few weeks to go ahead and get a new base
>rating for some progs, to bring the SSDF back into line with FIDE ratings, as
>an attempt to put this GM issue, or should I say this 2500-FIDE-rating issue,
>to bed, and show I was totally right once and for all :)!

Hi Charles,

A fine idea, except:

Rebel has changed (developed) during the course of the games so far, so the Rebel of today is not the same Rebel that drew with Anand. The hardware used has also changed.

SSDF does not (officially?) test Rebel, at Ed's request, due to Rebel's problems playing with the Autoplayer. So while the comparison could be done by SSDF, they are (morally) not allowed to publish the results.

I'd like to see some of these top programs entered in real round-robin tournaments, preferably with a decent "computer board" and the computer hidden away, to lessen the distraction for the opponents. I think this is the best way to get a genuine Elo rating. There is still a problem in getting a rating, though: the "program", as time goes by, will want to upgrade both software and hardware. Wouldn't it be great to see the computer and cables being carried up to the stage of the hall where the "top" games are being played? The post-game analysis would be interesting to see, as the human implies he/she could have won if....

As the programs get better and the hardware gets faster, the chance of computers playing in "real" tournaments seems more remote.

If we simply used the SSDF results to say:

"these few programs are currently the best of those being tested on this hardware"
"the next few programs may be as good as the best, but are probably not quite that good"
"this program version X is probably a bit better than version X-1"

then IMO the goals of the SSDF would be more correctly interpreted. I think that using the SSDF table to say:

"this program is rated yyyy"
"this program is y points better than that program"

is incorrect.

As soon as an SSDF-tested program plays enough tournament games to get a rating, your recalibration idea becomes possible, within the realms of the +/- confidence level.

And what if I take program A to a series of tournaments and achieve a rating of 2200, and you take the same program (my copy of program A and my computer) to a series of tournaments and achieve a rating of 2500? After 30 or so games each, it is quite feasible that such variances can occur (see the rough simulation in the P.S. below). Meanwhile, SSDF may have already recalibrated the whole list on the 2200. What to do?

It would be nice, though, to have some confidence about the playing strength of computers.

Hi guys,
Tina Long
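P.S. For what it's worth, here is a rough sketch in Python of both calculations: a performance rating from a batch of games (Charles' 20-game idea), and how much a 30-game rating estimate can swing (my 2200-vs-2500 worry). All the numbers in it are made up for illustration; the 2450/2350/2300 ratings and the draws-ignored game model are my assumptions, not anything from actual games.

import random
import statistics

def expected_score(rating, opp_ratings):
    # Total expected score under the standard Elo logistic model.
    return sum(1.0 / (1.0 + 10 ** ((opp - rating) / 400.0))
               for opp in opp_ratings)

def performance_rating(score, opp_ratings, lo=1000.0, hi=4000.0):
    # Bisect for the rating whose expected total score equals the
    # actual score (clamping to [lo, hi] also handles perfect scores).
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected_score(mid, opp_ratings) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# A base rating from 20 games: e.g. 12.5/20 against 2450-rated opposition.
print("perf over 20 games: %.0f" % performance_rating(12.5, [2450] * 20))

# Simulate many 30-game samples for a program of "true" strength 2350
# against 2300-rated opponents. Each game here is a win or a loss;
# ignoring draws overstates the spread somewhat.
TRUE, OPP, N = 2350, 2300, 30
p_win = 1.0 / (1.0 + 10 ** ((OPP - TRUE) / 400.0))
random.seed(1)
perfs = [performance_rating(sum(random.random() < p_win for _ in range(N)),
                            [OPP] * N)
         for _ in range(2000)]
print("mean perf over 30 games: %.0f" % statistics.mean(perfs))
print("std dev of the estimate: %.0f" % statistics.stdev(perfs))

On this toy model the standard deviation of a 30-game performance rating comes out around 60-70 points, so two honest samples landing well over 100 points apart is nothing unusual. That is the kind of +/- confidence level I mean above.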