Author: Ed Schröder
Date: 10:48:00 03/16/98
>Posted by Enrique Irazoqui on March 16, 1998 at 07:18:30:

>>Unrealistic view, Dirk :))
>>Remember the programmers are to blame and not the SSDF!

>>The programmers have taken advantage of AUTO232 by adding special
>>software to their released programs. Rebel8 (just in time) came with a
>>memory-resident learner that is only active when running AUTO232. This
>>to prevent Rebel8 being outplayed on the SSDF.

>>It works well as long as you don't interrupt a match. If a match is
>>interrupted, the learner results are lost. Hiarcs3 has a similar
>>system. There was no time (due to release pressure) to change the
>>software and save the learn results permanently on disk.

>>Rebel9 came with a book learner. Results are now saved on disk. Avoid
>>lost games, repeat won games. In principle programmed for the public,
>>but with a CLEAR second goal: not to be slaughtered on the SSDF.

>>Let's face it, book learning is way out of control in comp-comp games.
>>Just look at the SSDF Fritz5 - Rebel8 (P90) score of 31.5 - 8.5 and
>>count the doubles.

>>This weekend I checked these 40 games and found out that the Rebel8
>>memory-resident learner is bypassed by either temporarily interrupting
>>the F5-R8 match or not starting R8 with the "A" parameter.

>>I understand the SSDF tester; he wants to use his PCs for other
>>purposes too. However, my point is that if the F5(200) - R8(P90)
>>40-game match is replayed without any interruption, the 31.5 - 8.5
>>score will (by far) not be reproduced, because of the R8 memory-resident
>>learner.

>>I have checked these 40 games and found out that in almost all cases
>>the R8 learner was not active.

>Which means it's a design flaw of R8's learner, and not a wrong way to
>test by the SSDF.

Did I say that? On the contrary, I think; see above. My point is that
changing the way you test will give different results.

>Games are played. In which order they are played doesn't matter. The
>problem is that this particular result is not realistic.

Here you say it.
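For scale: under the standard Elo expected-score formula, a 31.5 - 8.5 score over 40 games corresponds to a rating gap of roughly 228 points, which is part of why such a result looks implausible as a pure strength difference. A quick check:

```python
# Implied Elo gap of a 31.5 - 8.5 result over 40 games, using the
# standard Elo relation: expected score p = 1 / (1 + 10^(-diff/400)),
# inverted to diff = 400 * log10(p / (1 - p)).
import math

score, games = 31.5, 40
p = score / games                      # 0.7875
diff = 400 * math.log10(p / (1 - p))   # rating gap implied by the score
print(round(diff))                     # roughly 228 Elo points
```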
And IMO the same applies to the F5-Junior 31.5 - 8.5 result. Do you
really think this is the real difference in playing strength? Junior
finishing first in Paris. Ring a bell?

>Proposing to the SSDF, or to anyone else, to test the way best suited
>for each program is not realistic either, simply because it is not
>feasible. There are many programs and many games.

>Let's assume all programs have a properly designed learner. It writes
>to disk, it avoids falling for the same losing line. Program A plays
>program B, both with good learners. Program A tries to repeat a winning
>line. Program B will avoid it. This way they mutually tend to minimize
>the effect of learners. As a consequence, the engines are measured and
>we have no winning double games.

Why not throw learners out? They hide the real playing strength of a
chess engine. See my "double game" definition below.

>>It's my opinion that the solution is very simple and is fully in the
>>hands of the SSDF themselves.

>>#1. Remove all the doubles.

>This is not practical. First, we have to define what a double is. It
>was the subject of a long discussion a long time ago.

I disagree with your "not practical" statement. If it is so essential,
you have to practice it!

Definition of a double game for COMP_X vs COMP_Y games: take the moment
whenever X or Y leaves the book. Simple as that. A second game reaching
that same situation is a double game.

Get rid of double games. What do they contribute to comp-comp games?
Nothing, in my opinion.

>Second, and for the above reasons, competent learners should take care
>of not losing the same game twice. Therefore, no losing doubles
>anymore.

Easy to say, difficult to program. Moreover, you indirectly say: "The
best learner wins the jackpot"... How about the chess engine? Isn't
that the main goal of the SSDF?

>Third, for statistical accuracy we want very many games played. Testers
>can not check them all one by one.

A little utility can do the job. Secondly, the above described double
game definition has been implemented since Rebel7.
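A minimal sketch of such a utility, assuming (hypothetically) that each game is already stored as a move list together with the ply at which the first program left its book; a real tool would read the games from PGN and detect the book exit itself:

```python
# Sketch of the double-game filter described above: two comp-comp games
# count as doubles when they reach the same position at the moment
# either program leaves its opening book. The game records and the
# "book_exit_ply" field are invented here for illustration.

def filter_doubles(games):
    """Keep only the first game for each distinct out-of-book position."""
    seen = set()
    kept = []
    for game in games:
        # Identify the position by the moves played up to and including
        # the book-exit ply.
        key = tuple(game["moves"][: game["book_exit_ply"] + 1])
        if key in seen:
            continue  # a double: same line out of book, drop it
        seen.add(key)
        kept.append(game)
    return kept

games = [
    {"moves": ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"], "book_exit_ply": 4},
    {"moves": ["e4", "e5", "Nf3", "Nc6", "Bc4", "Bc5"], "book_exit_ply": 4},
    {"moves": ["e4", "e5", "Nf3", "Nc6", "Bb5", "Nf6"], "book_exit_ply": 4},
]
print(len(filter_doubles(games)))  # prints 2: the third game repeats the first game's line
```

Keying on the moves up to the book exit implements the definition above: everything after that point is the engines on their own, so a repeated out-of-book position means a repeated game.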
Easy to program.

>>#2. Accept only generally accepted AUTO232 software, also available
>>for the public to check.

>I am very much in favor, basically to avoid suspicion. But I don't see
>how this relates to this issue.

>>The SSDF is in full control.
>>They set the rules.
>>They have my trust.

>>In the meantime you may give me your advice what to do.

>>#1. Spend 3-4-5 months of my time to write the perfect comp-comp
>>learner? Goal: ELO 2900 on the SSDF, but in reality 400 points less?

>Learners should neutralize each other. As a result, this 400 points
>difference is not real.

If you can recognize the opponent, everything is possible. Scenario:
play 200-300 games against an SSDF opponent. You then have a learned
book especially tuned for that opponent. Save the new book. Repeat that
for every expected SSDF opponent. Save the new books. Release the
program with these opponent-optimized books. When running under
AUTO232, recognize the opponent and load the "prepared" book.

I am not in the mood to put energy into that. It's also a clear cheat.
However, if you manage it, you can enter the SSDF at 2900.

>>#2. Forget about the SSDF and fully concentrate on the engine and
>>useful new features?

>Once the learner is designed, I guess it's the end of the problem. Am I
>missing something?

Yes, you miss that learners are in the state of the Boris computer of
the late '70s. So much to improve.

>>#3. Resign from the SSDF, Rebel not on the SSDF anymore, this in
>>combination with point #2?

>Shooting yourself in the foot? If we have no SSDF, who is going to tell
>that program A is any better than mass-market programs selling for 25%
>of the price?

I don't know how important the SSDF is. Based on previous years I would
say a first place on the SSDF gives you 10-15% extra sales. Not much.

I think I will give it a try for the successor of Rebel9 and see if I
can do without the SSDF. More features in Rebel, that's for sure.

>Besides, I think it is a mistake to blame testers for what is basically
>a programmer's problem.
Who said the SSDF is to blame? Blame the programmers for their AUTO232
tricks. Blame the SSDF for doing nothing about the problem. However, in
the right order...

>>#4. Leave things as they are?

>Definitely. With better learners.

I disagree. Learners hide the strength of an engine in comp-comp games.
My vote is to get rid of them on the SSDF, simply by excluding double
games. Then (again) you will see the naked ELO of the tested chess
engine.

So what do you want?

a) The naked ELO of the tested chess engine on the SSDF.

or

b) An ELO to which you should add/subtract 50-200 ELO points because of
learning.

You pick....

- Ed -

>Enrique