Author: Uri Blass
Date: 13:33:48 12/03/02
On December 03, 2002 at 15:52:33, Robert Hyatt wrote:

>On December 03, 2002 at 13:41:24, Uri Blass wrote:
>
>>On December 03, 2002 at 12:54:24, Rolf Tueschen wrote:
>>
>>>Until now nobody from the programmer group had ever spoken about that evident truth. SMK says that these tests can't show the strength of play or, as was claimed for this test, the "ability to analyse". SMK also explained (for the first time in such direct speech) how he and every programmer could fake the results of such tests. He then speaks about the question of whether it could be discovered, as it was by T. Mally in the case of Ed Schröder, and he said that of course he could do it so that nobody could find out. In fact he had written such a "tool", but in the end he decided to leave it out of the commercial product.
>>>
>>>But all this gives me the opportunity to talk about the reasons why such testing, even with these top-class positions, is nonsense. And why it has nothing to do, well, almost nothing, with _real_ strength.
>>>
>>>I think I can show you why, especially for those allegedly positional positions, the test is nonsense and he is measuring something else, not the analysing power of the engine.
>>>
>>>I will keep it very short so that you can do your own research.
>>>
>>>(Just to mention that I raised that problem two years ago already, as 'Schachfan' in the CSS forum, but back then it was about a tactical mate position.)
>>>
>>>Look, if you have a positional game of chess, where do you choose the point for a test? Of course, in this WM-Test of Gurevich et al you take the position at exactly the point where a certain move, well commented by the experts and mostly beautiful, has been made. Because there the commentators said: only with this move could he preserve the slight advantage. But the truth is that the engines often see, within what is actually possible for them, two solutions very close together. And in general it can be said that for positional positions without tactics the evals are not very impressive at all. So how would you account for that in your results? Would you really take a difference of 0.01 points as decisive? Is that relevant?
>>>
>>>But the main problem of such test positions is this.
>>>
>>>The point of that "nice move" (the one that caught the attention of the commentators) is by no means the most important moment for the decision making. Let me explain the irony. The usual commentators are masters themselves. Therefore they take certain decisions as completely normal, because they are easy and trivial for _them_, but not so for the amateurs. Or the machines, so to speak. But now go with me backwards a few moves. How optimistic are you that a machine would be better prepared to make the right decision there in such _positional_ games? And that is exactly the point for these test positions. _Realistically_ we would have to test the machines in the positions where only experienced humans know how to play so as to be in a position later to make some "decisive" moves, the moves then commented on by our experts. Only the earlier positions would allow a verdict on whether our current machines can play positional chess. We already know the answer. For the moment, they can't.
>>>
>>>But therefore such tests, with such great pretension, are a fake, a hoax in themselves. And Stefan MK explained it with the possible distinction. In reality M. Gurevich is making a question of life or death out of it.
>>>But earlier somewhere I already mentioned that it's ridiculous to claim the honor for a so-called, get this, I translate, World Champion Test. These positions are simply taken from Wch matches. What a thrill! But it has been known for ages that the chess of these matches is not always the best possible, because it is mainly a psychological fight. And fortunately Gurevich didn't claim that he was testing psychology. But it was just published that one position wasn't from Wch chess at all: a game between Anand and Shirov. And to make the scandal even greater, the authors used a false position. Instead of the K standing on c7, they put him on d7. But with Kc7 we have two solutions: the sought Ng5 and now the odd Bg5 too. Christ! A whole life's work of a few hours of choosing some positions out of Wch games is in danger of losing all its reputation. Doctor, doctor, gimme the news...!
>>>
>>>Rolf Tueschen
>>>
>>>On December 03, 2002 at 09:26:42, Eduard Nemeth wrote:
>>>
>>>>Very interesting post from SMK in the CSS Forum (German only).
>>>>
>>>>Please read it; I think a translation would be interesting for you!
>>>>
>>>>Read here:
>>>>
>>>>http://f23.parsimony.net/forum50826/messages/54995.htm
>>
>>Positional test suites are not impossible.
>>
>>I think that the known test suites are not good for that purpose, and I also believe that it is not easy to build them, so I prefer tactical test suites. I believe that there is a lot of room for improvement in tactics.
>>
>>Positional test suites should not always be positions that are hard for humans; they may also include positions that are easy for humans but hard for some of the computers.
>>
>>A possible way to build them may be to analyze a lot of computer games from the SSDF and find the positional mistakes that the programs made; the target can be to avoid those mistakes.
>
>You miss the point. For a tactical position, it is easy to show that the winning tactical idea is correct and winning beyond a doubt. For a positional test position, the program can make the right move for the right reason, or it might make it for the wrong reason, but both get the same score. About the only way to do this is to create positions where there are attractive (but wrong) moves that could be played, and see if the program plays them. If it plays a bad move, it clearly doesn't understand the issue. If it plays the right move, you only know that it doesn't appear to misunderstand things, but it could also just be lucky.
>
>>The main problem is to agree about the positional mistakes.
>>
>>There are a lot of cases where computers can translate a positional advantage that they do not understand into a positional advantage that they understand, so if most of the programs agree after a long search that the move is correct, then that is going to be evidence that the move is correct.
>>
>>You can still define the move as a positional move, because what the computers see after a long search is not a tactic that wins material but one that wins a better pawn structure or better mobility.
>
>But a program without mobility analysis can still make the right move for the wrong reason, so the test will be worthless...

It may happen in one position, but if the test has enough positions from a lot of games, then I do not expect a program that knows nothing to be always lucky.
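As a rough sketch of how such a suite could be scored: real chess test suites use the EPD "bm" (best move) and "am" (avoid move) opcodes for exactly the trap-move idea Hyatt describes. The positions, move strings, and the know-nothing stub below are made up for illustration; only Ng5/Bg5 echo the Anand-Shirov example above.

    # Sketch of an "avoid move" suite scorer, in the spirit of the EPD
    # bm (best move) / am (avoid move) opcodes used by chess test suites.
    # FEN strings and move names here are illustrative placeholders.
    import random

    SUITE = [
        {"fen": "<position 1>", "bm": "Ng5", "am": "Bg5"},  # attractive but wrong Bg5
        {"fen": "<position 2>", "bm": "b4",  "am": None},   # no trap, just a best move
        # ... many more positions, taken from many different games
    ]

    def score_suite(suite, get_move):
        """Count positions where the engine finds the best move and
        avoids the attractive-but-wrong alternative."""
        passed = 0
        for pos in suite:
            move = get_move(pos["fen"])            # ask the engine (stubbed below)
            ok = (move == pos["bm"])               # must find the best move...
            if pos["am"] is not None:
                ok = ok and (move != pos["am"])    # ...and must not fall for the trap
            passed += ok
        return passed, len(suite)

    def know_nothing_engine(fen):
        # stand-in for a program with no positional knowledge at all:
        # it just guesses among a few plausible-looking moves
        return random.choice(["Ng5", "Bg5", "b4"])

    print(score_suite(SUITE, know_nothing_engine))

    # The luck argument in numbers: if a clueless program passes one
    # position with probability p, it passes all n with p**n.
    p, n = 1 / 3, 30
    print(f"chance of passing all {n} positions by luck: {p**n:.1e}")  # ~4.9e-15

One lucky position proves nothing, but over thirty positions the chance of a know-nothing program passing them all by accident collapses to essentially zero.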
I think that the right positional test may be productive for testing changes in a program's evaluation, but such a test is not enough and programmers should do more tests. The side that makes more mistakes is not necessarily the side that loses in chess: the importance of one big mistake may be bigger than the importance of two small mistakes. It is possible that a change in the evaluation produces fewer mistakes but does not make the program better, because the new big mistakes are more important than the old small mistakes.

Uri