Computer Chess Club Archives


Subject: Re: Test Elo was never my point. Bye.

Author: Mike S.

Date: 06:51:07 06/13/04


On June 12, 2004 at 21:50:04, Rolf Tueschen wrote:

>On June 12, 2004 at 19:43:27, Mike S. wrote:
>
>>(...)

>Now if you repeat only one time such a nonsense about Bob in CSS, you get no
>more response from me. You want to doubt that his "opinion" is correct, that
>Stellungstests are not able to make Elo ratings for computerchess programs???

I have never claimed that testsuites can produce true Elo ratings like the ones
that evolve from game competition! Actually I don't know anybody who claims
this. I cannot understand why people are addressing *me* as if I had claimed
that. What is this based upon?? I don't say this, and the WM-Test staff doesn't
say this either. Testsuites produce *test ratings*. Often, people use only the
number of solutions to compare the engines' performance, or that number and the
time used are "somehow" combined in a reasonable formula, to allow comparisons
between the engines. - That's it; I see no big secret and no problem whatsoever.

Take a look at my own test, the Quicktest: The ratings are within a bandwidth of
(roughly) 200...450 points. I could have added a kind of rating base of ~2300
points, which would just result in numbers in an Elo-like range ("visually"), but
*I did not* because IMO it makes no sense and adds no information. It would
change neither the rankings nor the differences. The only approximation to
(computer) Elo numbers I tried to build into it is that a (theoretical) speed
doubling of an engine should result in a gain of ~50+ test rating points. (More
than ~50 can be gained when many additional solutions are found.)

http://members.aon.at/computerschach/quick/quick2.htm
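
The point about the rating base can be checked with a few lines of Python. This
is only a minimal sketch: the engine names and numbers are invented for
illustration and are not taken from the Quicktest list, and BASE is simply the
~2300 offset mentioned above.

```python
# Hypothetical test ratings (invented numbers, not real Quicktest results).
test_ratings = {"Engine A": 431, "Engine B": 388, "Engine C": 252}

BASE = 2300  # the optional "rating base" discussed above (assumption)


def ranking(ratings):
    """Engine names ordered from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)


def differences(ratings):
    """Rating difference for every pair of engines."""
    names = sorted(ratings)
    return {(a, b): ratings[a] - ratings[b]
            for a in names for b in names if a < b}


# Add the same constant to every rating.
shifted = {name: r + BASE for name, r in test_ratings.items()}

# Both comparisons hold: a constant offset changes only how the numbers look.
print(ranking(test_ratings) == ranking(shifted))
print(differences(test_ratings) == differences(shifted))
```

The offset cancels in every pairwise difference, so the order and the gaps
between engines stay exactly the same.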

>>I'd want a reply by Prof. Hyatt confirming that he takes notice that the WM-Test
>>respects that principle, and does of course not count solutions which are
>>"forgotten" later during the testing time. (I would have wanted that he would
>>have informed himself better, before explaining why "such tests are basically
>>flawed" when the WM-Test being the main topic in this thread here, isn't one of
>>such tests...)
>
>
>But his "opinion" is the same! He knows how to do such tests and he does still
>say, like Uri and Ed, that Stellungstests, in English Position Tests, can NOT
>substantiate exact (Elo like) ratings.

That was NEVER my point!! I was ONLY talking about the test condition of stable
solutions, now. This really stinks! One point of the criticism is proven
invalid, so you and H. simply skip it and come up with the Elo-like ratings. I'm
not interested in that subtopic. I know that testsuites cannot produce true FIDE
or SSDF Elos. I'm not defending "Test Elos" (although there is indeed
something which really deserves that name; see below).

Hint: You can also calculate scientifically correct true Elo ratings from soccer
results. :-) Elo isn't adaptable for chess games only...
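
To illustrate the soccer remark, here is a minimal sketch of the standard Elo
expected-score and update formulas applied to match results. The team names, the
scores, the starting rating of 1500 and K = 20 are all assumptions chosen for
the example; this is the common incremental update, not the method of any
particular rating list.

```python
def expected_score(r_a, r_b):
    """Standard Elo expectation for A against B (logistic curve, 400 scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def score_from_goals(goals_a, goals_b):
    """Map a soccer result to a game score: win 1, draw 0.5, loss 0."""
    if goals_a > goals_b:
        return 1.0
    if goals_a == goals_b:
        return 0.5
    return 0.0


def update(r_a, r_b, score_a, k=20.0):
    """Return the new ratings of A and B after one match."""
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


# Hypothetical results: (team 1, team 2, goals 1, goals 2).
results = [
    ("Team A", "Team B", 2, 1),
    ("Team B", "Team C", 0, 0),
    ("Team C", "Team A", 1, 3),
]

ratings = {"Team A": 1500.0, "Team B": 1500.0, "Team C": 1500.0}

for t1, t2, g1, g2 in results:
    ratings[t1], ratings[t2] = update(
        ratings[t1], ratings[t2], score_from_goals(g1, g2)
    )

for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 1))
```

Nothing in the formulas is specific to chess: they only need a result that can
be scored as win, draw or loss.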

What comes up next off-the-cuff :-)) ?

>But the whole work of Gurevich&Meiler is
>only reasonable in its limited design, if they pretend to get a valuable result.
>They give typical Elo ratings. But you want to imply that nobody NEVER presented
>such numbers. Nonsense! Of course they did it.

They just *called* it Elo, or Test-Elo. Doesn't anybody notice anyway, that
these numbers aren't game-based? So what?

Ok - the word, or name, "Elo" may have been used in a wrong place. But that is
not a very scientific argument against the test method, quality, importance and
interpretability of the results :-)) because *all these remain just the same*,
no matter if the ratings are called

"Elo,"
"Test-Elo,"
"Testrating," or
"Schakiquaki Numbers"...

It remains the same test, the same method, the same positions and the same
validity of the engines' performance comparisons no matter what you call it. So
that criticism is simply ridiculous and I feel mocked.

Btw. I've been told that the WM-Test ratings calculated with the original
formula aren't called Test-Elo anymore (but ...Rating). So I guess it's a
perfect test now by your standards, because your biggest point of criticism has
been removed. :-))

Btw. as you will know, there's an alternative ratings calculation method for
testsuite results, EloStatTS, which really uses the original Elo system (!). It
was developed by Dr. Frank Schubert, a scientist (AFAIK of mathematics). He
would guarantee that these numbers are scientifically based, true & perfect Elo
ratings. Indeed! :-)) I don't know if that destroys your world view.

If you want to know details about that (please don't ask me about it):
http://www.computerschach.de/freeware/index.htm
http://www.computerschach.de/freeware/EloStatTS1_0_3.zip
(The WM-Test download includes ratings calculated using this, too.)

(I'm not convinced that this alternative method is more useful or better than
some simpler formulas (I've spotted a big weakness). I mention it just as an
illustration of what's possible and can be downloaded and studied, and that in
fact the true Elo system can be used to calculate testsuite ratings...)

>You dare to judge about one Bob Hyatt with pondering if he
>could fit into your online interviews????

This seems to be a big misunderstanding; of course he would have fit perfectly
into our "gallery" of interviewed computerchess VIPs (we had already done such
events with the Junior team, Ch.Theron, Ch.Donninger, SMK...) It would have been
an honour. I'm aware of his historic achievements and his current work and
successes. A big hero. (Still, I take the freedom of having a different opinion
from his, now and then... :-)) But I don't feel motivated to be active in that
direction anymore after the latest impressions. I don't feel the WM-Test & CSS
were treated fairly in this recent discussion. (Actually CSS is *never* treated
fairly in CCC, but that's nothing new. Play your games without me in the
future...)

I guess I'm leaving now, finally (probably I'll check back once or twice to read
replies, only).

Regards,
M.Scheidl



Last modified: Thu, 15 Apr 21 08:11:13 -0700