Author: Mike S.
Date: 06:51:07 06/13/04
On June 12, 2004 at 21:50:04, Rolf Tueschen wrote:

>On June 12, 2004 at 19:43:27, Mike S. wrote:
>
>>(...)

>Now if you repeat only one time such a nonsense about Bob in CSS, you get no
>more response from me. You want to doubt that his "opinion" is correct, that
>Stellungstests are not able to make Elo ratings for computerchess programs???

I have never claimed that testsuites can produce true Elo ratings like those that evolve from game competition! Actually, I don't know anybody who claims this. I cannot understand why people are addressing *me* as if I had claimed that. What is this based upon?? I don't say this, and the WM-Test staff doesn't say this either.

Testsuites produce *test ratings*. Often, people use only the number of solutions to compare the engines' performance, or that number and the time usage are "somehow" combined in a reasonable formula, to allow comparisons between the engines. - That's it; I see no big secret and no problem whatsoever.

Take a look at my own test, the Quicktest: The ratings are within a bandwidth of (roughly) 200...450 points. I could have added a kind of rating base of ~2300 points, which would just result in numbers in an Elo-like range ("visually"), but *I did not*, because IMO it makes no sense and gives no additional information. It would change neither the rankings nor the differences. The only approximation to (computer) Elo numbers I tried to build into it is that a (theoretical) speed doubling of an engine should result in a gain of ~50+ test rating points. (More than ~50 can be gained when many additional solutions are found.)

http://members.aon.at/computerschach/quick/quick2.htm

>>I'd want a reply by Prof. Hyatt confirming that he takes notice that the WM-Test
>>respects that principle, and does of course not count solutions which are
>>"forgotten" later during the testing time.
>>(I would have wanted that he would
>>have informed himself better, before explaining why "such tests are basically
>>flawed" when the WM-Test being the main topic in this thread here, isn't one of
>>such tests...)
>
>
>But his "opinion" is the same! He knows how to do such tests and he does still
>say, like Uri and Ed, that Stellungstests, in English Position Tests, can NOT
>substantiate exact (Elo like) ratings.

That was NEVER my point!! I was ONLY talking about the test condition of stable solutions, now. This really stinks! One point of the criticism is proven invalid, so you and H. simply skip it and come up with the Elo-like ratings instead. I'm not interested in that sub-topic. I know that testsuites cannot produce true FIDE or SSDF Elos. I'm not defending "Test Elos" (although there does indeed exist something which really deserves that name; see below).

Hint: You can also calculate scientifically correct, true Elo ratings from soccer results. :-) Elo isn't applicable to chess games only... What comes up next off-the-cuff :-)) ?

>But the whole work of Gurevich&Meiler is
>only reasonable in its limited design, if they pretend to get a valuable result.
>They give typical Elo ratings. But you want to imply that nobody NEVER presented
>such numbers. Nonsense!

Of course they did it. They just *called* it Elo, or Test-Elo. Doesn't everybody notice anyway that these numbers aren't game-based? So what?

Ok - the word, or name, "Elo" may have been used in the wrong place. But that is not a very scientific argument against the test method, or against the quality, importance and interpretability of the results :-)) because *all these remain just the same*, no matter if the ratings are called "Elo," "Test-Elo," "Testrating," or "Schakiquaki Numbers"... It remains the same test, the same method, the same positions and the same validity of the engine performance comparisons, no matter what you call it. So that criticism is simply ridiculous and I feel mocked.

Btw.
I'm being told that the WM-Test ratings calculated with the original formula aren't called Test-Elo anymore (but ...Rating). So I guess it's a perfect test now by your standards, because your biggest point of criticism has been removed. :-))

Btw., as you will know, there's an alternative ratings calculation method for testsuite results, EloStatTS, which really uses the original Elo system (!). It was developed by Dr. Frank Schubert, a scientist (AFAIK of mathematics). He would guarantee that these numbers are scientifically based, true & perfect Elo ratings. Indeed! :-)) I don't know if that destroys your world view. If you want to know details about that (please don't ask me about it):

http://www.computerschach.de/freeware/index.htm
http://www.computerschach.de/freeware/EloStatTS1_0_3.zip

(The WM-Test download includes ratings calculated using this, too.)

(I'm not convinced that this alternative method is more useful or better than some simpler formulas; I've spotted a big weakness. I mention it just as an illustration of what is possible and can be downloaded and studied, and that in fact the true Elo system can be used to calculate testsuite ratings...)

>You dare to judge about one Bob Hyatt with pondering if he could fit into your online interviews????

This seems to be a big misunderstanding; of course he would have fit perfectly into our "gallery" of interviewed computerchess VIPs (we have already done such events with the Junior team, Ch.Theron, Ch.Donninger, SMK...). It would have been an honour. I'm aware of his historic achievements and his current work and successes. A big hero. (Still, I take the freedom of having another opinion than his, now and then... :-))

But I don't feel motivated to be active in that direction anymore after the latest impressions. I don't feel the WM-Test & CSS were treated fairly in this recent discussion. (Actually, CSS is *never* treated fairly in CCC, but that's nothing new. Play your games without me in the future...)
I guess I'm leaving now, finally (probably I'll check back once or twice, only to read replies).

Regards,
M.Scheidl
Last modified: Thu, 15 Apr 21 08:11:13 -0700