Computer Chess Club Archives

Subject: Re: Show it's predictive power.

Author: Bertil Eklund
Date: 16:04:11 12/10/99
On December 10, 1999 at 15:24:45, Charles Unruh wrote:

>On December 10, 1999 at 14:49:07, Dann Corbit wrote:
>
>>On December 10, 1999 at 14:34:22, Charles Unruh wrote:
>>>On December 10, 1999 at 13:24:32, Roger wrote:
>>[snip]
>>>>Recalibration is inevitable, because GM-computer games will become much more
>>>>common.
>>>
>>>
>>>Well, not inevitable, as no one is currently doing it, but hopefully it will
>>>get done soon. We may already have enough Rebel games to start with; if not,
>>>then a few more games with a few willing masters on ICC and we'll have enough
>>>to rate that program.  As for Junior, I suspect there are enough games, or
>>>close to it. Add a few games on ICC or FICS, for that matter, and it would be
>>>a good start toward recalibrating the SSDF, to finally go ahead and prove that
>>>I was right all along (heck, that ICC was right, based on the ICC poll we had
>>>concerning GM strength.  Most people thought comps were GM strength!) :).
>>
>>Rebel's GM challenge is an example where human players of known strength are
>>pitted against a computer under tournament conditions and time control with
>>money at stake.
>>
>>So you are mistaken that nobody is doing it.
>
>No, I am not mistaken.  I said that nobody was recalibrating the SSDF.  I said
>that with these results the SSDF could be recalibrated, which was the point of
>my starting the thread!!

OK, you are right about that, and totally wrong otherwise: the SSDF list is
(was) calibrated against human players under tournament conditions, not in
single-game matches with double-increment time controls. It's like comparing a
1500 m runner against a participant in a heptathlon contest in his last event
(sorry, I don't know the name). I guess that, like me, a lot of people have
participated in tournaments in another town (country) and remember how they
felt in the last round after a lot of dining and wining. Compare those results
against the weekly play in your club. At least a computer can!

Bertil SSDF
>>However, it will require that the
>>SSDF use hardware equivalent to what Ed is using for some tests in order to
>>understand how the tests relate to one another.
>
>I do believe that for total accuracy we would of course need to test against
>Rebel on the hardware that was used.  Though I think there could be some
>calculation, with some relative closeness, of how much strength Rebel might
>lose on the lesser hardware, though this would certainly not be the most
>recommended thing to do.
>
>>Also, it is a single program
>>and there are not a lot of games.  But no one besides Ed seems to be ready to
>>put their money where their mouth is.
>
>Well, it's fine that it's a single program (though I did mention that Junior
>seems to have a good number of games too).  If I were 1600 and played a 2200
>for 20 games and lost every game, that wouldn't tell me much about my strength.
>However, if I were 2160 strength and played a 20-game match vs a 2200, losing
>some and drawing some, that would tell me something about my strength.  Since
>the progs follow this second scenario of being relatively close in strength, 20
>games against a single program, say H7 vs Rebel (with the rating calculated
>from human games), would tell me something (at least something more) about the
>strength of H7.  It would of course be more ideal to have at least 3 progs
>tested vs humans.  I suggested a starting point of Junior and Rebel.  H7 may
>have some games on record; I don't know of them.  I picked Rebel because the
>games are in a series and aren't being artificially selected.  Meaning that I
>could probably round up 6 games of H6, but that would be artificial, because I
>could pick all wins, which wouldn't make sense to do.  Though I do think it
>would be OK to, say, add together 2 tournaments.  Say H7 played in tournament
>"A" and tournament "B"; then as long as I included all games, I feel that it
>would be satisfactory to add "A"+"B" to calculate a rating.  As I was saying,
>I think Junior may have a record that fits the just-mentioned scenario.
>>
>>I don't think that ICC or FICS games should be used to try to calibrate a
>>rating.  I doubt the GMs are very serious most of the time, and I expect that
>>they are using the computers as sounding boards to test strategies for tactical
>>weaknesses.  If I were a GM, that's what I would use them for.
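
The 20-game match argument quoted above can be made concrete with a small
sketch. It assumes the standard Elo logistic expected-score formula; the thread
does not say which formula the SSDF actually uses, and the function names here
are illustrative only, not taken from any of the posts.

```python
import math


def expected_score(rating, opponent):
    """Expected score per game (0..1) under the standard Elo logistic curve.

    Assumption: the usual 400-point scale; not confirmed as SSDF's method.
    """
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def performance_rating(opponent, points, games):
    """Rating that would make `points` out of `games` the expected result.

    Returns None for a 0% or 100% score: the estimate is unbounded there,
    which is why losing all 20 games says little about true strength.
    """
    p = points / games
    if p <= 0.0 or p >= 1.0:
        return None
    return opponent + 400.0 * math.log10(p / (1.0 - p))


# A 1600 player expects well under one point in 20 games against a 2200,
# so a 0/20 result cannot separate 1600 from, say, 1400 or 1700.
print(20 * expected_score(1600, 2200))

# A 2160 player expects close to nine points, so the score carries information.
print(20 * expected_score(2160, 2200))

# An even score against a 2200 maps back to 2200; a whitewash maps to nothing.
print(performance_rating(2200, 10, 20))
print(performance_rating(2200, 0, 20))
```

Pooling two complete tournaments, as suggested in the quoted text, would amount
to summing points and games before calling performance_rating, provided every
game is included and a representative opponent rating is chosen.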