Author: Vine Smith
Date: 23:26:36 05/21/01
On May 21, 2001 at 23:34:00, Christophe Theron wrote:
>On May 21, 2001 at 18:00:04, Vine Smith wrote:
>
>>On May 21, 2001 at 14:16:52, Christophe Theron wrote:
>>
>>>On May 21, 2001 at 06:41:02, Vine Smith wrote:
>>>
>>>>On May 21, 2001 at 02:15:05, Christophe Theron wrote:
>>>>
>>>>>On May 21, 2001 at 01:32:07, Vine Smith wrote:
>>>>>
>>>>>>On May 20, 2001 at 14:47:12, Christophe Theron wrote:
>>>>>>
>>>>>>>On May 20, 2001 at 14:26:15, Vine Smith wrote:
>>>>>>>
>>>>>>>>On May 20, 2001 at 13:24:29, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On May 20, 2001 at 04:25:41, Frank Phillips wrote:
>>>>>>>>>
>>>>>>>>>>On May 19, 2001 at 23:48:48, Christophe Theron wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 19, 2001 at 23:37:31, Ratko V Tomic wrote:
>>>>>>>>>>>
>>>>>>>>>>>>>I'm extremely surprised that my creature managed to survive more
>>>>>>>>>>>>> than 30 moves, given a 300 times speed handicap.
>>>>>>>>>>>>
>>>>>>>>>>>>The flip side is that the current programs running on some
>>>>>>>>>>>>future machines at 300 GHz won't be able to crush the current
>>>>>>>>>>>>programs on 1 GHz any more convincingly (in terms of how
>>>>>>>>>>>>many moves the slower machine can hang on) than what happened
>>>>>>>>>>>>in this matchup.
>>>>>>>>>>>>
>>>>>>>>>>>>This is the same effect that many players have experienced
>>>>>>>>>>>>when upgrading their hardware to a 2-3 times faster one and
>>>>>>>>>>>>then being disappointed, after all the expense and hopes,
>>>>>>>>>>>>when they can't even notice any difference in the perceived
>>>>>>>>>>>>program strength (against humans).
>>>>>>>>>>>
>>>>>>>>>>>You are absolutely right.
>>>>>>>>>>>
>>>>>>>>>>>I think we are already beginning to experience the effects of diminishing returns
>>>>>>>>>>>in chess on current hardware at long time controls.
>>>>>>>>>>>
>>>>>>>>>>>    Christophe
>>>>>>>>>>
>>>>>>>>>>Would someone take the time to explain this simply and clearly, to me.
>>>>>>>>>>I can understand that if you are already beating humans (or some other group of
>>>>>>>>>>players) most of the time, then increasing the speed still means you are beating
>>>>>>>>>>them most of the time and maybe a bit more, but until a machine can see _all_
>>>>>>>>>>there is to see, how would it not improve by seeing more, and how can you say
>>>>>>>>>>(a priori) that it will improve only at a diminishing return? In other words, I
>>>>>>>>>>can believe that results against a set of players are asymptotic, tending towards
>>>>>>>>>>100 percent, but do not see why this is necessarily true of the game played by two
>>>>>>>>>>otherwise equally matched entities.
>>>>>>>>>
>>>>>>>>>In my opinion it has to do with the fact that in a given chess position the
>>>>>>>>>number of moves is limited. Generally you have between 20 and 50 legal moves.
>>>>>>>>>
>>>>>>>>>From these moves, only an even more limited subset does not lead to an obvious
>>>>>>>>>loss.
>>>>>>>>>
>>>>>>>>>And from this subset there is an even more limited subset of moves (2 or 3
>>>>>>>>>generally) that can be played, and choosing between them is a matter of
>>>>>>>>>preference because the amount of computation needed to prove which one is better
>>>>>>>>>is too big for any computer.
>>>>>>>>>
>>>>>>>>>So once you reach the stage where you can see which 2 or 3 moves are playable,
>>>>>>>>>it would take an additional huge computation to see further.
>>>>>>>>>
>>>>>>>>>I think some chess programs on current computers at long time controls have
>>>>>>>>>already reached this stage, and this is why it becomes increasingly difficult to
>>>>>>>>>say which one is better.
>>>>>>>>>
>>>>>>>>>This is a very simplistic explanation which lacks mathematical support, I know,
>>>>>>>>>but that's how I explain diminishing returns.
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>Is it possible that there is also a problem with bad evaluations infecting whole
>>>>>>>>branches in the tree of analysis? In Fritz vs. Gambit Tiger at Leiden, Fritz
>>>>>>>>played 21.b4, shutting in its queen. Was this not a dreadful move? Yet, I had
>>>>>>>>Fritz analyze after this point through 18 ply, and the evaluation was just +0.06
>>>>>>>>(after which it mysteriously halted analysis). And Tiger 14 has reached 20 ply
>>>>>>>>looking at this same position, with an evaluation of just +0.46 after 21...Nc3
>>>>>>>>22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Bb2 g5 26.Rde3 Bf4 27.Bxe4 fxe4 28.Rxe4 Rxe4
>>>>>>>>29.Rxe4 Qxd5 30.Re7. Actually, the final position is lost for White after
>>>>>>>>30...Qd3+ 31.Re2 Qb1+ 32.Ne1 Bf5, but White doesn't need to play 30.Re7. The
>>>>>>>>point is that neither program, given even 10-12 hours to think (on a PIII 850),
>>>>>>>>appreciates the disastrous effects of White's missing queen. As poor evaluations
>>>>>>>>like this clog up the search, all lines begin to look like one another, despite
>>>>>>>>huge differences between them that would be clear to any human player examining
>>>>>>>>these positions.
>>>>>>>>Regards,
>>>>>>>>Vine Smith
>>>>>>>
>>>>>>>I do not agree.
>>>>>>>
>>>>>>>Tiger KNOWS about the bad position of the Queen after b4 and would never play
>>>>>>>this move.
>>>>>>>
>>>>>>>If you try, you will see that Tiger's evaluation is different in the lines where the
>>>>>>>queen is trapped and in the lines where it is not.
>>>>>>>
>>>>>>>The evaluation difference is not big, but it is enough to avoid such a
>>>>>>>disastrous move in almost all the cases, and to try to find a way to free the
>>>>>>>queen if it happens to be trapped by a long sequence of forced moves.
>>>>>>>
>>>>>>>Tiger is able to identify some cases of blocked pieces or pieces with poor
>>>>>>>mobility in its evaluation.
>>>>>>>In particular, it is able to see that the queen is
>>>>>>>blocked after b4? and gives a penalty for this. I have worked hard on this part
>>>>>>>of the evaluation, so I can't let you generalize and say that any program would
>>>>>>>ignore the consequences of the trapped queen. Mine knows.
>>>>>>>
>>>>>>>    Christophe
>>>>>>Hi --
>>>>>>First of all, I want to say that I like Tiger 14 and Gambit Tiger 2.0 very much,
>>>>>>and was quite impressed by the game against Patzer at Leiden. That was some
>>>>>>terrific chess!
>>>>>
>>>>>Thanks!
>>>>>
>>>>>>But wouldn't you like to see a better evaluation from Tiger than +0.46 after
>>>>>>21.b4? That kind of score would make me think (if I didn't see the position),
>>>>>>"Oh, Tiger must have the two bishops and more space; or the opponent has a
>>>>>>couple of weak pawns." I would never imagine that instead, the opponent's most
>>>>>>powerful piece had been locked in by pawns and rendered completely immobile. The
>>>>>>evaluation leads me to wonder if Tiger would happily win a pawn at the expense
>>>>>>of liberating the queen, which seems like too low a price for such a gift to the
>>>>>>opponent.
>>>>>>I must also point out that 17 ply deep, the evaluation was +0.48, and the line
>>>>>>shown was 21...Nc3 22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Rde3 Bf4 26.Rd3 c6
>>>>>>27.dxc6 Bxc6 28.Bb3 Bd5 29.Bxd5 Qxd5 30.Qb6 Qa2. In this line, with 26...c6,
>>>>>>Tiger releases the queen, and for what? Of course, as it approaches this point
>>>>>>during actual play, it may change its mind, but its decision about what the
>>>>>>correct 21st move was for Black (at 17 ply deep) was based on a line in which it
>>>>>>gratuitously released the queen.
>>>>>>This was Tiger 14 doing the analysis -- is there any difference from Gambit
>>>>>>Tiger 2.0 regarding how the trapping penalty is applied?
>>>>>
>>>>>First I would like to point out that my program would NOT play b4.
>>>>>
>>>>>Don't forget that the b4 mistake has been played by Fritz, not by Tiger. So if
>>>>>you want to blame somebody........ :)
>>>>>
>>>>>Then you can argue about the value I'm using for the penalty for a trapped
>>>>>queen.
>>>>>
>>>>>Here is how I set such a penalty value, usually: I set it as low as possible.
>>>>>High enough so the program understands that there is a problem and does not play
>>>>>the faulty move, AND low enough so it is not going to interfere in a crazy way
>>>>>with the program's playing style.
>>>>>
>>>>>The idea behind this is that, as the programmer, I cannot think about all the
>>>>>consequences of such a penalty. When I introduce such a penalty in the
>>>>>evaluation it's because I have a set of positions where it is supposed to help,
>>>>>but no set of positions can have a good enough statistical significance. So
>>>>>there are obviously a lot of cases where the penalty will be counterproductive,
>>>>>and I have no idea of what these positions will be (sure, I will soon discover
>>>>>some in test games, if I set the penalty too high).
>>>>>
>>>>>So you might think that the trapped queen is worth one pawn or more, but still
>>>>>I'll give it a much lower weight in Tiger's evaluation.
>>>>>
>>>>>And you must also take into account the fact that the trapped queen is not only
>>>>>bad "by itself" (queen mobility = 0), but it is also bad because of the
>>>>>consequences, and don't forget that the search is going to catch (understand)
>>>>>some of these consequences (the search might be able to see that white is unable to
>>>>>defend against an attack because the queen cannot come near).
>>>>>
>>>>>So usually you do not need to give a really high positional penalty. A little
>>>>>penalty PLUS the additional positional problems that are going to be found by
>>>>>the search might come close to the overall penalty you would have given
>>>>>yourself, as a human player.
>>>>>
>>>>>    Christophe
>>>>Hi --
>>>>
>>>>I certainly hope I didn't imply that Tiger would play 21.b4. In fact, I checked
>>>>this right after interrogating Fritz about this move, and Tiger, I believe (I
>>>>don't have the analysis at hand), chose 21.b3. I don't really understand this
>>>>choice, either, but many programs go this way, some mentioned by Sune Larsson in
>>>>an earlier post, and I found that Yace 0.99.01 selected this as well.
>>>>Your explanation of how the penalty is applied was very helpful in understanding
>>>>the evaluations. I suppose that if it is set too high, the program would
>>>>possibly be overly casual about giving up pawns, or even pieces.
>>>>I do understand that the immobility of the queen is not in and of itself going
>>>>to win the game -- in fact, Gambit Tiger's approach versus Fritz seems to be
>>>>quite correct, open up the game, create some weaknesses to attack, and the
>>>>queen's absence will be keenly felt. Any attempt to win the queen appears to
>>>>fail, as I have investigated this myself, and also Tiger's 20-ply search after
>>>>21.b4 strongly implies this is not possible. The fact that the punishment for
>>>>the crime takes so long makes this an excellent example of long-term positional
>>>>play.
>>>>But one last question: pretend that Fritz's 21.b4 is forced, and that I am using
>>>>Tiger to analyze starting from, let's say Black's move 17, and that besides the
>>>>course in the game, there is one other significant line, all the others are no
>>>>good for Black. For the other significant line, there is an evaluation of +0.60,
>>>>and this is based on "normal" factors, such as pawn structure and superior minor
>>>>pieces. So when I run an 18-ply analysis from move 17, attempting to learn the
>>>>truth of the position, won't Tiger show me the other significant line, rather
>>>>than the more favorable queen-trap line due to the "artificially" low penalty?
>>>>And given the choice, would Tiger steer towards the hypothetical other line, or
>>>>does it have some way of recognizing that the numerically lower evaluation
>>>>actually represents a more favorable position?
>>>>Thanks for taking the time to discuss all of this with me.
>>>>Regards,
>>>>Vine
>>>
>>>Only numerical factors are taken into account, so if there is a line scoring
>>>+0.48 and another scoring +0.60, Tiger will go for +0.60.
>>>
>>>I accept the responsibility for choosing low weights for such penalties, even if
>>>it is possible to show that in *some* positions the penalty should have been set
>>>at a higher value. For any such position I'll be able to find another
>>>position where my low penalty is already too much and is causing a disaster.
>>>
>>>All that counts is the results in real life, which can be measured only by
>>>estimating the overall strength of the program, not individual positions.
>>>
>>>That's why I wouldn't blame Frans too much either for choosing to ignore some
>>>parameters such as trapped queens or king safety. The strength of his program
>>>advocates for his choices, and it is the duty of other programmers to prove he's
>>>wrong, if they can.
>>>
>>>    Christophe
>>Definitely, your approach seems very reasonable, and the practicalities of chess
>>programming require such decisions. But going back to the original point, where
>>there was speculation about the future 300 GHz machines, would this not imply
>>that the additional plies of search might be quite useless in many cases? As the
>>number of nodes increases, so do the possibilities of running into one of these
>>"exception" positions, where the evaluation does not match the position's true
>>potential.
>
>No, this is not what happens.
>
>These errors happen all the time and do not have a disastrous impact.
>
>Search beyond the positions where positional misunderstanding happens corrects
>the positional mistakes.
>
>As you have noticed, the last plies of the best lines displayed by a computer
>are often full of positional mistakes, but the first plies are much more
>accurate.
>
>A deep search will show mistakes in the last few plies, but a majority of the
>first plies is going to be accurate.
>
>Increasing the number of positions searched indeed increases the number of
>positions where the evaluation is wrong, but it does not increase the PROPORTION
>of these particular positions. So their impact does not increase, and deeper
>searches still produce better results.
>
>>Eventually, the possibility will approach 100%, and the evaluations
>>will blur into a continuum where tactically sound lines all appear roughly
>>equivalent to each other, separated by differences of hundredths of a pawn in
>>not entirely meaningful fashion.
>>I think if Kramnik considers the type of weakness Fritz displayed in the game
>>against Gambit Tiger, he can steer play towards positions requiring long-term
>>decisions and absolutely dominate even against Deep Fritz on an 8-processor
>>machine. The flaw is more general than trapped pieces, it extends to any
>>long-term factor that cannot be given a "proper" score due to its short-term
>>effects on the program's play. But I don't "blame" Fritz, or its programmers, I
>>just hope that someday a solution will be found to handle these exception cases.
>
>I'm not sure it's an important problem to focus on.
>
>It's what you want to see from a human point of view because it would please
>your sense of beauty, but nothing proves that it really improves the playing
>strength.
>
>I have added knowledge about trapped pieces because it was a means to improve the
>playing strength of my program, but I'm not ready to increase the penalty given
>to trapped pieces to the value a human player would evaluate them just for the
>beauty of it.
>
>    Christophe

Okay, at this point I give up and just wish you and the twin Tigers the best of luck at
future tournaments! Thanks again for your input.
Regards,
Vine
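
A note on the +0.48 versus +0.60 question raised in the thread: the following is a minimal sketch, not Tiger's actual code, of how a deliberately small trapped-queen term interacts with a purely numerical choice between two lines. Every name and number in it (score_line, choose_line, the 12 and 100 centipawn weights, the 36 centipawn positional score) is a hypothetical value picked only so that the totals match the +0.48 and +0.60 quoted above.

```python
# Hypothetical illustration only. Nothing here is taken from Tiger or Fritz.
# All scores are in centipawns (100 = one pawn), from Black's point of view.

def score_line(positional_terms, opponent_queen_trapped, trap_bonus):
    """Sum the ordinary positional terms and add a bonus if the opponent's queen is trapped."""
    return positional_terms + (trap_bonus if opponent_queen_trapped else 0)


def choose_line(lines, trap_bonus):
    """Pick the line with the highest numerical score; no other factor is consulted."""
    best = max(
        lines,
        key=lambda line: score_line(line["positional"], line["queen_trapped"], trap_bonus),
    )
    return best["name"]


# Two candidate lines, as in the hypothetical question about move 17.
LINES = [
    {"name": "queen-trap line", "positional": 36, "queen_trapped": True},
    {"name": "other line", "positional": 60, "queen_trapped": False},
]

SMALL_WEIGHT = 12    # assumed "as low as possible" weight: 36 + 12 = 48
HUMAN_WEIGHT = 100   # assumed "one pawn or more" weight a human might prefer

if __name__ == "__main__":
    print(choose_line(LINES, SMALL_WEIGHT))
    print(choose_line(LINES, HUMAN_WEIGHT))
```

With the small weight, the other line wins on the numbers alone, which is the behaviour Christophe describes. The larger weight flips the choice here, but the same weight would also apply in every position where it is counterproductive, which is the trade-off he gives as his reason for keeping it low.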
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.