Author: Charles Unruh
Date: 12:24:45 12/10/99
On December 10, 1999 at 14:49:07, Dann Corbit wrote:

>On December 10, 1999 at 14:34:22, Charles Unruh wrote:
>>On December 10, 1999 at 13:24:32, Roger wrote:
>[snip]
>>>Recalibration is inevitable, because GM-computer games will become much more
>>>common.
>>
>>
>>Well not inevitable as no one is currently doing it, but hopefully it will get
>>done soon, we may already have enough rebel games to start with, if not, then
>>with a few more games with a few willing masters on ICC and we'll have enough to
>>rate that program. As for Junior, I suspect there are enough games if not then
>>close. And a few games on ICC or FICS for that matter and it would be a good go
>>to start recalibrating the ssdf, to finally go ahead and prove that i was right
>>all along(Heck that ICC was right based on the ICC poll we had concerning GM
>>strength. Most people thought Comps were GM strength!) :).
>
>Rebel's GM challenge is an example where human players of known strength are
>pitted against a computer under tournament conditions and time control with
>money at stake.
>
>So you are mistaken that nobody is doing it.

No, I am not mistaken. I said that nobody was recalibrating the SSDF. I said that with these results the SSDF could be recalibrated, which was the point of my starting the thread!!

>However, it will require that the
>SSDF use hardware equivalent to what Ed is using for some tests in order to
>understand how the tests relate to one another.

I do believe that for total accuracy we would of course need to test against Rebel on the hardware that was used. Though I think some calculation could be done to estimate, with relative closeness, how much strength Rebel might lose on the lesser hardware, though this would certainly not be the most recommended thing to do.

>Also, it is a single program
>and there are not a lot of games. But no one besides Ed seems to be ready to
>put their money where their mouth is.

Well, that's fine that it's a single program (though I did mention that Junior seems to have a good number of games too). If I were 1600 and played a 2200 for 20 games and lost every game, that wouldn't tell me much about my strength. However, if I were 2160 strength and played a 20-game match vs. a 2200, losing some and drawing some, that would tell me something about my strength. Since the progs follow this second scenario of being relatively close in strength, 20 games against a single program, say H7 vs. Rebel (with Rebel's rating calculated from its human games), would tell me something (at least something more) about the strength of H7. It would of course be more ideal to have at least 3 progs tested vs. humans. I suggested a starting point of Junior and Rebel. H7 may have some games on record; I don't know of them.

I picked Rebel because the games are in a series and aren't being artificially selected. Meaning that I could probably round up 6 games of H6, but that would be artificial, because I could pick all wins, which wouldn't make sense to do. Though I do think it would be OK to, say, add together 2 tournaments. Say H7 played in tournament "A" and tournament "B"; then as long as I included all games, I feel it would be satisfactory to add "A"+"B" to calculate a rating. As I was saying, I think Junior may have a record that fits the just-mentioned scenario.

>
>I don't think that ICC or FICS games should be compared to try and calibrate a
>rating. I doubt if the GM's are very serious most of the time and expect that
>they are using the computers as sounding boards to test strategies for tactical
>weaknesses. If I were a GM, that's what I would use them for.
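
To put rough numbers on the 20-game example in the reply above (1600 vs. 2200 compared with 2160 vs. 2200), here is a minimal sketch. It assumes the standard Elo logistic expected-score formula on a 400-point scale; nothing in this thread says the SSDF calculates ratings exactly this way. The function names and the 9/20 match score are made up for illustration.

```python
def expected_score(rating, opponent_rating):
    """Expected score per game under the standard Elo logistic model (assumed here)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def performance_rating(opponent_ratings, total_score):
    """Rating whose expected total score equals the actual total score.

    Solved by bisection. Returns None for a 0% or 100% score, because any
    sufficiently low (or high) rating fits such a result equally well.
    """
    games = len(opponent_ratings)
    if total_score <= 0 or total_score >= games:
        return None
    low, high = 0.0, 4000.0
    for _ in range(100):
        mid = (low + high) / 2.0
        expected = sum(expected_score(mid, r) for r in opponent_ratings)
        if expected < total_score:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Scenario 1: a 1600 player is expected to score only about 3% per game
# against a 2200, so a 0/20 result cannot pin the rating down.
print(expected_score(1600, 2200))
print(performance_rating([2200] * 20, 0))

# Scenario 2: a 2160 player is expected to score about 44% per game, so a
# close match result is informative. The 9/20 score is a hypothetical example.
print(expected_score(2160, 2200))
print(performance_rating([2200] * 20, 9))
```

Pooling tournaments "A" and "B" would then just mean concatenating the two lists of opponent ratings and adding the two scores before calling `performance_rating`, as long as every game from both events is included.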