Author: Rolf Tueschen
Date: 15:29:27 12/07/02
I was away the whole day, but I found some good ideas for the answer, Bob.

On December 06, 2002 at 20:11:31, Bob Durrett wrote:

>On December 06, 2002 at 19:48:09, Rolf Tueschen wrote:
>
><snip>
>
>>Also a debate between you and me and others here is the best that could happen
>>because that is interdisciplinary cooperation. You could bring the very best of
>>your talents into the debate because others might go visiting on too many
>>tangents... then you organize the recovery!
>>
>>Rolf Tueschen
>
>My debating skills are worse than those of a newborn baby! I know my
>limitations. That is my one great strength [I think.] Besides, there are other
>productive formats for discourse besides debate. Brainstorming.
>_ _ _ _ _ _ _ _ _ _ _ _
>
>But I would like to get back to your ideas regarding chess software.
>
>In particular, your feeling that it would not be possible to measure the
>strength of a chess engine [or a human either, for that matter] by using a set
>of test positions.

Objection. For a human it is always a good test of how many "expressions" of chess technique he has mastered. What I said wasn't that tests were nonsense, but that positional positions are impossible as tests for computers.

>When students graduate from college with a Bachelor's Degree, here in the USA,
>they are encouraged to take a comprehensive exam which is intended to indicate
>whether or not the student learned anything. [Versus wasting several years.]
>
>I had to take such a test. As an electrical engineer, I was required to take
>the GRE Advanced Test in Engineering. I did very well on that test and was
>admitted to Graduate School primarily for that reason.
>
>I would like to suggest that, if I had to take such a test, it is only fair that
>every chess engine should have to take an equivalent test too!

Now that is one of the sentences I couldn't understand yesterday. For what purpose do you want to order that?
And as for your big test there, I would say it would also be a good test if a professor simply talked with a student, because then he could well see whether the student had understood something. I don't trust test suites too much. Multiple-choice technology is extremely fallacious.

>The test would be very comprehensive. It would include five or ten suites of
>test positions. Perhaps 500 positions in all, minimum. A new set of positions
>would be used each year.
>
>In the proposed scenario, the testing organization should have the
>responsibility and resources necessary to design and adjust the tests to match
>the SSDF results.

Again I can't understand you. What do the incredibly invalid test results of SSDF have to do with our question of position testing?

>In other words, I propose a comprehensive test which has, itself, been tested
>and verified against the SSDF [and similar] test data.

Excuse me, but this is a logical impossibility. You can't expect to create something out of "nothing". That would be called magic. :)

>If you stick to your guns on this, you will assert that the proposed idea would
>fail miserably. Right? But why would it fail? Could you be specific, please?
>
>: ) : ) : ) : ) : ) : ) : ) : ) : ) : ) : ) : ) : ) : )
>
>Bob D.

Ok, I stick to my guns. Shooting mode ON:

Now the creative part of my Saturday reflections on the road in the Cold. In Germany "cold" is already 1 degree Celsius plus. <g>

I want to prove now, once and for all, why these matches in SSDF don't say anything interesting about the strength of chess programs. Programmers of the World, please listen! (sorry, it's shooting mode)

What is a chess game?
======================

a) GM chess

A chess game between masters is an inter-related combination of the correct application of GM technique and chains of chess positions.
A GM can win a game either by 1) more sophisticated application of technique, or 2) an error of the opponent, or 3) just by luck of position that neither player could foresee (all according to the Law that the concrete positions dominate in chess, not the 'technique' or wishful thinking or believing in magic). Point 3) is interesting, because nearer to the special concreteness both masters know the solution, but only one side can profit from it.

b) amateur 1500 chess

That kind of chess is also a chain of concrete positions, but after a short opening period (learned by heart) almost all positions are treated by chance, without correct technique. The amateur has a lot of ideas but can't judge which idea is appropriate in the concrete position. The amateur believes that his idea is itself the power that brings the win. In reality a wrong move could well win because of a false reaction by the opponent. But since the next move could also be weaker, one false reaction normally doesn't lose. We could sum up that such a game is a series of moves by chance, and in the end luck of position normally wins, if the many mistakes didn't lead to a clear material advantage before.

c) chess program chess

After the extremely high-level opening period, tactical positions (appearing by chance or through clever choice of the opening books and their depth) are played with extremely exact technique by both sides, followed either by a material advantage and a later win or, by chance of position, by one of the three possible results.

Now my interest was focussing on those games with positions after the opening period without tactics. Of course that must happen in games against humans, because machines can't play positional chess and will always search for tactics. Even if there are none! ;) Now in these positional positions it's again a matter of chance. With all possible combinations.
A basically better position (from a GM's perspective) could lead to a loss, a draw or a win, and in the same way all other positions could lead to these three results. Because everything is happening in chance mode. 'Digging in fog' mode.

Now let's quickly draw the logical conclusions. If that is happening in games, it would also happen in positions in tests. (I'm still talking about c)!)

Now the irony is that the test creators calculate their results for their positional positions by following the evaluations on the display of the programs, not by following the understanding of the machines. Reasonably so, because we don't know a thing about this "understanding". A programmer could say (see Bob Hyatt) that his program knows the story of the two advanced pawns and their strength, but you could never show that with test positions. Because the final position is - Rolf said it before - too late to be sure that the prog has understood the topic. But if we go back a few moves in the development of the position, the chance factor could destroy our beautiful idea.

Just to mention the new discovery of the defenders of such tests: they insinuate that if something should go wrong in a position, that doesn't matter, because over the "overall" 100 positions this would be "ausgemittelt" (averaged out). A creative term out of magic. It's basically the argument that 1 or 10 nonsense positions would not be so important if we base our final judgement on 100 positions. I would say true, unless the 10 nonsense positions are all in the positional part of the test. Rolf dixit! <g>

Back to SSDF games. All these games are in the end (in the small sector of real chess in between opening books and endgame tables!) a chain of chance moves in blindness mode. And since the blindness of each war generation (still shooting mode ON) is universal (I liked the term in the reasoning of Michael Gurevich in his comeback speech), the best programs of the same generation are always in the same neighborhood.
But that SSDF result was known before, because it's a universal LAW! So as always I mention that the whole of the SSDF results are invalid. You could only test chess programs in games against human GMs. Because then you leave the sphere of chance and magic, with the general restriction of the basic chance factor through the dominating concreteness in chess (where even GMs are helpless, but against progs this influence could still be properly controlled, with the exception of exhibition matches again, where the GM must also take care of the commercial interests of the company, see Bahrain...).

Please tell me if you can use some of these thoughts, Bob.

Rolf Tueschen
Last modified: Thu, 15 Apr 21 08:11:13 -0700