Author: Vine Smith
Date: 15:00:04 05/21/01
On May 21, 2001 at 14:16:52, Christophe Theron wrote:

>On May 21, 2001 at 06:41:02, Vine Smith wrote:
>
>>On May 21, 2001 at 02:15:05, Christophe Theron wrote:
>>
>>>On May 21, 2001 at 01:32:07, Vine Smith wrote:
>>>
>>>>On May 20, 2001 at 14:47:12, Christophe Theron wrote:
>>>>
>>>>>On May 20, 2001 at 14:26:15, Vine Smith wrote:
>>>>>
>>>>>>On May 20, 2001 at 13:24:29, Christophe Theron wrote:
>>>>>>
>>>>>>>On May 20, 2001 at 04:25:41, Frank Phillips wrote:
>>>>>>>
>>>>>>>>On May 19, 2001 at 23:48:48, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On May 19, 2001 at 23:37:31, Ratko V Tomic wrote:
>>>>>>>>>
>>>>>>>>>>>I'm extremely surprised that my creature managed to survive more
>>>>>>>>>>> than 30 moves, given a 300 times speed handicap.
>>>>>>>>>>
>>>>>>>>>>The flip side is that the current programs running on some
>>>>>>>>>>future machines at 300 GHz won't be able to crush the current
>>>>>>>>>>programs on 1 GHz any more convincingly (in terms of how
>>>>>>>>>>many moves the slower machine can hang on) than what happened
>>>>>>>>>>in this matchup.
>>>>>>>>>>
>>>>>>>>>>This is the same effect that many players have experienced
>>>>>>>>>>when upgrading their hardware to one 2-3 times faster and
>>>>>>>>>>then being disappointed, after all the expense and hopes,
>>>>>>>>>>when they can't even notice any difference in the perceived
>>>>>>>>>>program strength (against humans).
>>>>>>>>>
>>>>>>>>>You are absolutely right.
>>>>>>>>>
>>>>>>>>>I think we are already beginning to experience the effects of diminishing returns
>>>>>>>>>in chess on current hardware at long time controls.
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>Would someone take the time to explain this simply and clearly to me? I can
>>>>>>>>understand that if you are already beating humans (or some other group of
>>>>>>>>players) most of the time, then increasing the speed still means you are beating
>>>>>>>>them most of the time and maybe a bit more, but until a machine can see _all_
>>>>>>>>there is to see, how would it not improve by seeing more, and how can you say
>>>>>>>>(a priori) that it will improve only at a diminishing return? In other words, I
>>>>>>>>can believe that results against a set of players are asymptotic, tending towards
>>>>>>>>100 percent, but do not see why this is necessarily true of the game played by two
>>>>>>>>otherwise equally matched entities.
>>>>>>>
>>>>>>>In my opinion it has to do with the fact that in a given chess position the
>>>>>>>number of moves is limited. Generally you have between 20 and 50 legal moves.
>>>>>>>
>>>>>>>From these moves, only an even more limited subset does not lead to an obvious
>>>>>>>loss.
>>>>>>>
>>>>>>>And from this subset there is an even more limited subset of moves (2 or 3
>>>>>>>generally) that can be played, and choosing between them is a matter of
>>>>>>>preference because the amount of computation needed to prove which one is better
>>>>>>>is too big for any computer.
>>>>>>>
>>>>>>>So once you reach the stage where you can see which 2 or 3 moves are playable,
>>>>>>>it would take an additional huge computation to see further.
>>>>>>>
>>>>>>>I think some chess programs on current computers at long time controls have
>>>>>>>already reached this stage, and this is why it becomes increasingly difficult to
>>>>>>>say which one is better.
>>>>>>>
>>>>>>>This is a very simplistic explanation which lacks mathematical support, I know,
>>>>>>>but that's how I explain diminishing returns.
>>>>>>>
>>>>>>>    Christophe
>>>>>>
>>>>>>Is it possible that there is also a problem with bad evaluations infecting whole
>>>>>>branches in the tree of analysis? In Fritz vs. Gambit Tiger at Leiden, Fritz
>>>>>>played 21.b4, shutting in its queen. Was this not a dreadful move? Yet, I had
>>>>>>Fritz analyze after this point through 18 ply, and the evaluation was just +0.06
>>>>>>(after which it mysteriously halted analysis). And Tiger 14 has reached 20 ply
>>>>>>looking at this same position, with an evaluation of just +0.46 after 21...Nc3
>>>>>>22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Bb2 g5 26.Rde3 Bf4 27.Bxe4 fxe4 28.Rxe4 Rxe4
>>>>>>29.Rxe4 Qxd5 30.Re7. Actually, the final position is lost for White after
>>>>>>30...Qd3+ 31.Re2 Qb1+ 32.Ne1 Bf5, but White doesn't need to play 30.Re7. The
>>>>>>point is that neither program, given even 10-12 hours to think (on a PIII 850),
>>>>>>appreciates the disastrous effects of White's missing queen. As poor evaluations
>>>>>>like this clog up the search, all lines begin to look like one another, despite
>>>>>>huge differences between them that would be clear to any human player examining
>>>>>>these positions.
>>>>>>Regards,
>>>>>>Vine Smith
>>>>>
>>>>>I do not agree.
>>>>>
>>>>>Tiger KNOWS about the bad position of the Queen after b4 and would never play
>>>>>this move.
>>>>>
>>>>>If you try, you will see that Tiger's evaluation is different in the lines where the
>>>>>queen is trapped and in the lines where it is not.
>>>>>
>>>>>The evaluation difference is not big, but it is enough to avoid such a
>>>>>disastrous move in almost all the cases, and to try to find a way to free the
>>>>>queen if it happens to be trapped by a long sequence of forced moves.
>>>>>
>>>>>Tiger is able to identify some cases of blocked pieces or pieces with poor
>>>>>mobility in its evaluation. In particular, it is able to see that the queen is
>>>>>blocked after b4? and gives a penalty for this. I have worked hard on this part
>>>>>of the evaluation, so I can't let you generalize and say that any program would
>>>>>ignore the consequences of the trapped queen. Mine knows.
>>>>>
>>>>>    Christophe
>>>>Hi --
>>>>First of all, I want to say that I like Tiger 14 and Gambit Tiger 2.0 very much,
>>>>and was quite impressed by the game against Patzer at Leiden. That was some
>>>>terrific chess!
>>>
>>>Thanks!
>>>
>>>>But wouldn't you like to see a better evaluation from Tiger than +0.46 after
>>>>21.b4? That kind of score would make me think (if I didn't see the position),
>>>>"Oh, Tiger must have the two bishops and more space; or the opponent has a
>>>>couple of weak pawns." I would never imagine that instead, the opponent's most
>>>>powerful piece had been locked in by pawns and rendered completely immobile. The
>>>>evaluation leads me to wonder if Tiger would happily win a pawn at the expense
>>>>of liberating the queen, which seems like too low a price for such a gift to the
>>>>opponent.
>>>>I must also point out that 17 ply deep, the evaluation was +0.48, and the line
>>>>shown was 21...Nc3 22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Rde3 Bf4 26.Rd3 c6
>>>>27.dxc6 Bxc6 28.Bb3 Bd5 29.Bxd5 Qxd5 30.Qb6 Qa2. In this line, with 26...c6,
>>>>Tiger releases the queen, and for what? Of course, as it approaches this point
>>>>during actual play, it may change its mind, but its decision about what the
>>>>correct 21st move was for Black (at 17 ply deep) was based on a line in which it
>>>>gratuitously released the queen.
>>>>This was Tiger 14 doing the analysis -- is there any difference from Gambit
>>>>Tiger 2.0 regarding how the trapping penalty is applied?
>>>
>>>First I would like to point out that my program would NOT play b4.
>>>
>>>Don't forget that the b4 mistake has been played by Fritz, not by Tiger. So if
>>>you want to blame somebody........ :)
>>>
>>>Then you can argue about the value I'm using for the penalty for a trapped
>>>queen.
>>>
>>>Here is how I set such a penalty value, usually: I set it as low as possible.
>>>High enough so the program understands that there is a problem and does not play
>>>the faulty move, AND low enough so it is not going to interfere in a crazy way
>>>with the program's playing style.
>>>
>>>The idea behind this is that, as the programmer, I cannot think about all the
>>>consequences of such a penalty. When I introduce such a penalty in the
>>>evaluation it's because I have a set of positions where it is supposed to help,
>>>but no set of positions can have a good enough statistical significance. So
>>>there are obviously a lot of cases where the penalty will be counterproductive,
>>>and I have no idea of what these positions will be (sure, I will soon discover
>>>some in test games, if I set the penalty too high).
>>>
>>>So you might think that the trapped queen is worth one pawn or more, but still
>>>I'll give it a much lower weight in Tiger's evaluation.
>>>
>>>And you must also take into account the fact that the trapped queen is not only
>>>bad "by itself" (queen mobility = 0), but it is also bad because of the
>>>consequences, and don't forget that the search is going to catch (understand)
>>>some of these consequences (the search might be able to see that white is unable to
>>>defend against an attack because the queen cannot come near).
>>>
>>>So usually you do not need to give a really high positional penalty. A little
>>>penalty PLUS the additional positional problems that are going to be found by
>>>the search might come close to the overall penalty you would have given
>>>yourself, as a human player.
>>>
>>>    Christophe
>>Hi --
>>
>>I certainly hope I didn't imply that Tiger would play 21.b4. In fact, I checked
>>this right after interrogating Fritz about this move, and Tiger, I believe (I
>>don't have the analysis at hand), chose 21.b3. I don't really understand this
>>choice, either, but many programs go this way, some mentioned by Sune Larsson in
>>an earlier post, and I found that Yace 0.99.01 selected this as well.
>>Your explanation of how the penalty is applied was very helpful in understanding
>>the evaluations. I suppose that if it is set too high, the program would
>>possibly be overly casual about giving up pawns, or even pieces.
>>I do understand that the immobility of the queen is not in and of itself going
>>to win the game -- in fact, Gambit Tiger's approach versus Fritz seems to be
>>quite correct: open up the game, create some weaknesses to attack, and the
>>queen's absence will be keenly felt. Any attempt to win the queen appears to
>>fail, as I have investigated this myself, and also Tiger's 20-ply search after
>>21.b4 strongly implies this is not possible. The fact that the punishment for
>>the crime takes so long makes this an excellent example of long-term positional
>>play.
>>But one last question: pretend that Fritz's 21.b4 is forced, and that I am using
>>Tiger to analyze starting from, let's say, Black's move 17, and that besides the
>>course in the game, there is one other significant line; all the others are no
>>good for Black. For the other significant line, there is an evaluation of +0.60,
>>and this is based on "normal" factors, such as pawn structure and superior minor
>>pieces. So when I run an 18-ply analysis from move 17, attempting to learn the
>>truth of the position, won't Tiger show me the other significant line, rather
>>than the more favorable queen-trap line, due to the "artificially" low penalty?
>>And given the choice, would Tiger steer towards the hypothetical other line, or
>>does it have some way of recognizing that the numerically lower evaluation
>>actually represents a more favorable position?
>>Thanks for taking the time to discuss all of this with me.
>>Regards,
>>Vine
>
>Only numerical factors are taken into account, so if there is a line scoring
>+0.48 and another scoring +0.60, Tiger will go for +0.60.
>
>I accept the responsibility for choosing low weights for such penalties, even if
>it is possible to show that in *some* positions the penalty should have been set
>at a higher value. For any of such positions I'll be able to find another
>position where my low penalty is already too much and is causing a disaster.
>
>All that counts is the results in real life, which can be measured only by
>estimating the overall strength of the program, not individual positions.
>
>That's why I wouldn't blame Frans too much either for choosing to ignore some
>parameters such as trapped queens or king safety. The strength of his program
>advocates for his choices, and it is the duty of other programmers to prove he's
>wrong, if they can.
>
>    Christophe

Definitely, your approach seems very reasonable, and the practicalities of chess
programming require such decisions. But going back to the original point, where
there was speculation about the future 300 GHz machines, would this not imply
that the additional plies of search might be quite useless in many cases? As the
number of nodes increases, so do the possibilities of running into one of these
"exception" positions, where the evaluation does not match the position's true
potential. Eventually, the possibility will approach 100%, and the evaluations
will blur into a continuum where tactically sound lines all appear roughly
equivalent to each other, separated by differences of hundredths of a pawn in a
not entirely meaningful fashion.

I think if Kramnik considers the type of weakness Fritz displayed in the game
against Gambit Tiger, he can steer play towards positions requiring long-term
decisions and absolutely dominate even against Deep Fritz on an 8-processor
machine. The flaw is more general than trapped pieces; it extends to any
long-term factor that cannot be given a "proper" score due to its short-term
effects on the program's play.

But I don't "blame" Fritz, or its programmers; I just hope that someday a
solution will be found to handle these exception cases.

Regards,
Vine
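
P.S. A back-of-the-envelope sketch of the "possibility will approach 100%" point above. It assumes, purely for illustration, that every searched position is an "exception" independently and with the same fixed probability; the rate of one in a million and the function name are hypothetical, not measurements from Tiger, Fritz, or any other program. Under that assumption the chance of meeting at least one exception among n nodes is 1 - (1 - p)^n.

```python
import math


def chance_of_exception(p_single: float, nodes: int) -> float:
    """Chance that at least one of `nodes` positions is an "exception".

    Assumes (for illustration only) that each position is an exception
    independently with probability `p_single`, where 0 <= p_single < 1.
    """
    # 1 - (1 - p)^n, written with log1p/expm1 so that a very small p
    # is not lost to floating-point rounding.
    return -math.expm1(nodes * math.log1p(-p_single))


if __name__ == "__main__":
    p = 1e-6  # hypothetical rate: one misjudged position per million
    for nodes in (10**4, 10**6, 10**8):
        print(f"{nodes:>11,d} nodes -> {chance_of_exception(p, nodes):.4f}")
```

With these made-up numbers the chance is small for a shallow search and close to certainty for a much larger one, which is all the argument needs. Real positions in a search tree are far from independent, so this is only a rough picture.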
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.