Computer Chess Club Archives

Subject: Re: Show it's predictive power.

Author: Charles Unruh

Date: 12:24:45 12/10/99

On December 10, 1999 at 14:49:07, Dann Corbit wrote:

>On December 10, 1999 at 14:34:22, Charles Unruh wrote:
>>On December 10, 1999 at 13:24:32, Roger wrote:
>[snip]
>>>Recalibration is inevitable, because GM-computer games will become much more
>>>common.
>>
>>
>>Well, not inevitable, as no one is currently doing it, but hopefully it will get
>>done soon. We may already have enough Rebel games to start with; if not, then
>>a few more games with a few willing masters on ICC and we'll have enough to
>>rate that program.  As for Junior, I suspect there are enough games, or close
>>to it. Add a few games on ICC or FICS, for that matter, and it would be a good
>>start on recalibrating the SSDF, to finally go ahead and prove that I was right
>>all along (heck, that ICC was right, based on the ICC poll we had concerning GM
>>strength.  Most people thought comps were GM strength!) :).
>
>Rebel's GM challenge is an example where human players of known strength are
>pitted against a computer under tournament conditions and time control with
>money at stake.
>
>So you are mistaken that nobody is doing it.

No, I am not mistaken.  I said that nobody was recalibrating the SSDF.  I said
that with these results the SSDF could be recalibrated, which was the point of
me starting the thread!!

>However, it will require that the
>SSDF use hardware equivalent to what Ed is using for some tests in order to
>understand how the tests relate to one another.

I do believe that for total accuracy we would of course need to test against
Rebel on the hardware that was used.  Though I think there could be some
calculation done to estimate, with some relative closeness, how much strength
Rebel might lose on the lesser hardware, this would certainly not be the
most recommended thing to do.

>Also, it is a single program
>and there are not a lot of games.  But no one besides Ed seems to be ready to
>put their money where their mouth is.

Well, it's fine that it's a single program (though I did mention that Junior
seems to have a good number of games too).  If I were 1600 and played a 2200 for
20 games and lost every game, that wouldn't tell me much about my strength.
However, if I were 2160 strength and lost some and drew some in a 20 game match
vs a 2200, that would tell me something about my strength.  Since the progs
follow this second scenario of being relatively close in strength, 20 games
against a single program, say H7 vs Rebel (with its rating calculated from
human games), would tell me something (at least something more) about the
strength of H7.  It would of course be more ideal to have at least 3 progs
tested vs humans.  I suggested a starting point of Junior and Rebel.  H7 may
have some games on record; I don't know of them.  I picked Rebel because the
games are in a series and aren't being artificially selected.  Meaning
that I could probably round up 6 games of H6, but that would be artificial,
because I could pick all wins, which wouldn't make sense to do.  Though I do
think it would be OK to, say, add together 2 tournaments.  Say H7 played in
tournament "A" and tournament "B"; then as long as I included all games, I feel
that it would be satisfactory to add "A"+"B" to calculate a rating.  As I was
saying, I think Junior may have a record that fits the scenario just mentioned.
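
To put rough numbers on the 1600 vs 2160 example, here is a minimal sketch that assumes the standard logistic Elo formula (the SSDF and the chess servers may compute ratings differently). The function names are made up for illustration. It shows why a 0/20 result gives no estimate at all, while a mixed result against a 2200 does, and that pooling every game from two events is just adding points and games before doing the calculation.

```python
import math

def expected_score(rating, opponent):
    """Expected score per game under the standard logistic Elo model (assumed)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def performance_rating(opponent, points, games):
    """Rating whose expected score equals the actual score against one opponent.

    Returns None for a 0% or 100% score: such a result only bounds the
    rating, it does not pin it down.
    """
    p = points / games
    if p <= 0.0 or p >= 1.0:
        return None
    return opponent + 400.0 * math.log10(p / (1.0 - p))

# A 1600 is expected to score about 3% per game against a 2200, so losing
# all 20 games is what almost any much weaker player would produce.
print(expected_score(1600, 2200))
print(performance_rating(2200, 0, 20))    # None: no estimate possible

# A close match (9 points from 20 games) does give an estimate.
print(performance_rating(2200, 9, 20))

# Pooling tournaments "A" and "B": include ALL games from both, then compute.
# The (points, games) pairs here are hypothetical.
a_points, a_games = 4.5, 10
b_points, b_games = 4.5, 10
print(performance_rating(2200, a_points + b_points, a_games + b_games))
```

The pooled case assumes every opponent is rated 2200, which is only there to keep the sketch short; with mixed opposition one would need each opponent's rating.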
>
>I don't think that ICC or FICS games should be compared to try and calibrate a
>rating.  I doubt if the GM's are very serious most of the time and expect that
>they are using the computers as sounding boards to test strategies for tactical
>weaknesses.  If I were a GM, that's what I would use them for.

Re: Show it's predictive power. Bertil Eklund 16:04:11 12/10/99
- Re: Show it's predictive power. Charles Unruh 17:36:11 12/10/99

Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.