Computer Chess Club Archives


Subject: Re: Poll Question - Tournaments vs Matches

Author: Robert Hyatt

Date: 20:26:59 01/07/00


On January 07, 2000 at 19:48:48, Bertil Eklund wrote:

>On January 07, 2000 at 13:11:14, Robert Hyatt wrote:
>
>>On January 07, 2000 at 08:21:18, Bertil Eklund wrote:
>>
>>>On January 06, 2000 at 17:07:01, Robert Hyatt wrote:
>>>
>>>>On January 06, 2000 at 10:20:15, Graham Laight wrote:
>>>>
>>>>>On January 06, 2000 at 10:12:53, Robert Hyatt wrote:
>>>>>
>>>>>>I don't dismiss it out of hand.  But if I have a question about the
>>>>>>effectiveness of brain surgery, I ask the _surgeon_ and not the _patient_.
>>>>>>They have two entirely different perspectives.  The patient recovers fully.
>>>>>>He considers this procedure a revolution.  The doctor knows that only one of
>>>>>>20 will recover.  He considers it terribly risky.  Who is right?
>>>>>>
>>>>>>Chess program 'users' have one perspective from playing the programs.  The
>>>>>>authors have a completely different one, knowing all the things that are
>>>>>>missing, all the things the program does poorly, all the things it gets
>>>>>>into trouble with...
>>>>>>
>>>>>>Which perspective seems most accurate?  The user of a black box, or the person
>>>>>>that 'filled' the black box?
>>>>>
>>>>>Or the impartial evaluator of the black box?
>>>>
>>>>
>>>>That is the point.  You can _not_ evaluate the black box.  You can only evaluate
>>>>the results.  The brain surgery worked.  You consider it wonderful.  Only the
>>>>doctor knows all the difficulties he had during the surgery, how close he came
>>>>to losing the patient, etc. Because the doctor sees _inside_ the black box.
>>>>
>>>>That is why 'impartial evaluation' is not easy until we simply have a lot of GM
>>>>games to go on.  At present we don't.  My view from inside the black box shows
>>>>thousands of problem areas that need work.  It may be that my view is wrong, if
>>>>and only if the black box can produce results against GM players that I don't
>>>>expect.  The easy way out of this is to wait.  We are getting data.  We know for
>>>>sure that Rebel isn't going to have a 2700 TPR based on games so far, so the
>>>>2700 number for Tiger on the SSDF is grossly overinflated.  As Ed said, and as I
>>>>have said many times, I would consider a TPR of 2500 a remarkable result.  And
>>>>that isn't good enough to make a GM.
>>>
>>>Over and over again: this is not TPR, it's MPR. What are you going to do, repeat
>>>this 500 times until it's true? OK, this seems to work here on a lot of people,
>>>but they are two different things.
>>>
>>>Bertil
>>
>>
>>And your point would be?  MPR or TPR doesn't mean a thing.  "PR" does.  A pure
>>performance rating.  It doesn't matter whether it comes from a match, from a
>>single tournament, or computed from a set of consecutive games.  The calculation
>>is identical in all three cases, the result is interpreted the same way.
>
>Hi!
>
>Again and again... The SSDF list is based on TOURNAMENT games. Of course it's a
>big difference playing one round or eleven against, say, eleven different players
>each and every day; at least I hope people playing in tournaments understand the
>difference.

This is going around and around in circles.  I used the term "TPR" to apply
to Rebel in the current set of games vs GM players.  Because _everyone_ knows
what a TPR really is, which is exactly the same thing as this new "MPR" term.
Except one is based on performance in a tournament, the other is based on
a match between two players.

SSDF isn't a tournament by any measure.  It is a bunch of matches.  Which
means that neither TPR nor MPR apply accurately to it.  But since both are
calculated _exactly_ the same way, it doesn't matter one hill of beans.  We
are talking about a performance rating and nothing else.  Calling any of them
TPR or MPR is actually _wrong_, since SSDF isn't a tournament or a match (it
is a combo of both), and Rebel's games are not a match or a tournament
either.  But all we really care about is the performance rating over the games
it plays...
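
To illustrate the "calculated exactly the same way" point, here is a minimal Python sketch. It assumes the common logistic approximation (average opponent rating plus 400 * log10(p / (1 - p))). It is not the SSDF's or FIDE's exact procedure, and the function name and all ratings below are made up for illustration.

```python
import math

def performance_rating(opponent_ratings, score):
    """Estimate a performance rating from opponent ratings and total score.

    Assumption: uses the logistic Elo approximation, not an official
    FIDE or SSDF table. One rating entry per game played.
    """
    n = len(opponent_ratings)
    if n == 0:
        raise ValueError("need at least one game")
    p = score / n  # fraction of points scored
    if not 0.0 < p < 1.0:
        raise ValueError("undefined for 0% or 100% scores")
    avg = sum(opponent_ratings) / n
    return avg + 400.0 * math.log10(p / (1.0 - p))

# Hypothetical data: a 10-game match against one 2500-rated opponent...
match_pr = performance_rating([2500] * 10, 5.0)
# ...and 10 games against a varied field that also averages 2500.
field_pr = performance_rating([2400, 2450, 2500, 2550, 2600] * 2, 5.0)
```

Under this approximation only the average opposition and the score fraction enter the formula, so whether the games came from a match, a tournament, or a mix of both makes no difference to the number.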

I don't get the play on words and semantic games here...






>
>I (we) don't have any problem with the list, but it seems that other people have
>serious trouble with the level of the list. When we have a clear indication of
>the proper level achieved in tournaments, the list could be calibrated.
>We can't adjust it because of some people's mission to talk it down (or up)
>because they have prominent friends, gut-feelings or connection with
>extra-terrestrials.



I think you guys do fine work.  But your upper rating numbers are 100%
irrational, and I don't think any serious chess player considers that the
top program would be anywhere near 2700 FIDE.

You didn't cause the inflation.  Your testing paradigm (computer vs computer
only) caused it.  The numbers are fine taken in the right context.  But trying
to convert them to FIDE is nonsense.


>
>As you and everyone else know, the level of the list was ridiculously close to the
>performance achieved in Aegon, except for the last year when increments were
>used.

Aegon wasn't 40/2, so the programs performed a bit better than expected,
which might have helped.  But following that thought, the programs didn't
get 300 points better over the last 2-3 years at Aegon performances...  yet
they have gone sky-high on your list...


>
>Now again you try to back up your opinion with the Rebel games (about 20),
>played with increments and sometimes double increments. As I have followed most
>of the games and have seen that in most of them the human was in serious
>time trouble, I guess the outcome in several of them wouldn't have been the same.
>
>Bertil
>


The Rebel games are _so_ close to 40/2hr time controls it isn't funny.
There aren't any 'double increments'.  At the time control, both add the
appropriate amount of time to their opponent's clock and they play on.  It
is about as close to 40/2hrs as can be done on a chess server.





>
>
>>Match or Tournament is irrelevant in this context.  The term "performance
>>rating" is what is important.  However it is derived.  In this case, from a
>>consecutive series of games...
>>
>>It really isn't two different things at all.  And the Rebel result isn't
>>really a MPR either, because in a match, the two opponents play multiple
>>games. This is _far_ closer to a tournament than a match, since each opponent
>>for Rebel is different.
>>
>>IMHO of course.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
