Computer Chess Club Archives

Subject: Re: ==> The future 300 GHz machines won't win very impressively either

Author: Christophe Theron
Date: 00:23:11 05/22/01
On May 22, 2001 at 02:26:36, Vine Smith wrote:

>On May 21, 2001 at 23:34:00, Christophe Theron wrote:
>
>>On May 21, 2001 at 18:00:04, Vine Smith wrote:
>>
>>>On May 21, 2001 at 14:16:52, Christophe Theron wrote:
>>>
>>>>On May 21, 2001 at 06:41:02, Vine Smith wrote:
>>>>
>>>>>On May 21, 2001 at 02:15:05, Christophe Theron wrote:
>>>>>
>>>>>>On May 21, 2001 at 01:32:07, Vine Smith wrote:
>>>>>>
>>>>>>>On May 20, 2001 at 14:47:12, Christophe Theron wrote:
>>>>>>>
>>>>>>>>On May 20, 2001 at 14:26:15, Vine Smith wrote:
>>>>>>>>
>>>>>>>>>On May 20, 2001 at 13:24:29, Christophe Theron wrote:
>>>>>>>>>
>>>>>>>>>>On May 20, 2001 at 04:25:41, Frank Phillips wrote:
>>>>>>>>>>
>>>>>>>>>>>On May 19, 2001 at 23:48:48, Christophe Theron wrote:
>>>>>>>>>>>
>>>>>>>>>>>>On May 19, 2001 at 23:37:31, Ratko V Tomic wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>>I'm extremely surprised that my creature managed to survive more
>>>>>>>>>>>>>> than 30 moves, given a 300 times speed handicap.
>>>>>>>>>>>>>
>>>>>>>>>>>>>The flip side is that the current programs running at some
>>>>>>>>>>>>>future machines at 300 GHz won't be able to crush the current
>>>>>>>>>>>>>programs on 1 GHz any more convincingly (in terms of how
>>>>>>>>>>>>>many moves the slower machine can hang on) than what happened
>>>>>>>>>>>>>in this matchup.
>>>>>>>>>>>>>
>>>>>>>>>>>>>This is the same effect that many players have experienced
>>>>>>>>>>>>>when upgrading their hardware to one 2-3 times faster and
>>>>>>>>>>>>>then being disappointed, after all the expense and hopes,
>>>>>>>>>>>>>when they can't even notice any difference in the perceived
>>>>>>>>>>>>>program strength (against humans).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>You are absolutely right.
>>>>>>>>>>>>
>>>>>>>>>>>>I think we are already beginning to experience the effects of diminishing returns
>>>>>>>>>>>>in chess on current hardware at long time controls.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>    Christophe
>>>>>>>>>>>
>>>>>>>>>>>Would someone take the time to explain this simply and clearly to me?  I can
>>>>>>>>>>>understand that if you are already beating humans (or some other group of
>>>>>>>>>>>players) most of the time, then increasing the speed still means you are beating
>>>>>>>>>>>them most of the time and maybe a bit more, but until a machine can see _all_
>>>>>>>>>>>there is to see, how would it not improve by seeing more, and how can you say
>>>>>>>>>>>(a priori) that it will improve only at a diminishing return?  In other words, I
>>>>>>>>>>>can believe that results against a set of players are asymptotic, tending towards
>>>>>>>>>>>100 percent, but do not see why this is necessarily true of the game played by two
>>>>>>>>>>>otherwise equally matched entities.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>In my opinion it has to do with the fact that in a given chess position the
>>>>>>>>>>number of moves is limited. Generally you have between 20 and 50 legal moves.
>>>>>>>>>>
>>>>>>>>>>From these moves, only an even more limited subset does not lead to an obvious
>>>>>>>>>>loss.
>>>>>>>>>>
>>>>>>>>>>And from this subset there is an even more limited subset of moves (2 or 3
>>>>>>>>>>generally) that can be played, and choosing between them is a matter of
>>>>>>>>>>preference because the amount of computation needed to prove which one is better
>>>>>>>>>>is too big for any computer.
>>>>>>>>>>
>>>>>>>>>>So once you reach the stage where you can see which 2 or 3 moves are playable,
>>>>>>>>>>it would take an additional huge computation to see further.
>>>>>>>>>>
>>>>>>>>>>I think some chess programs on current computers at long time controls have
>>>>>>>>>>already reached this stage, and this is why it becomes increasingly difficult to
>>>>>>>>>>say which one is better.
>>>>>>>>>>
>>>>>>>>>>This is a very simplistic explanation which lacks mathematical support, I know,
>>>>>>>>>>but that's how I explain diminishing returns.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    Christophe
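
Illustration (not part of the original posts): a back-of-the-envelope way to see why a 300-fold speed gap is less decisive than it sounds. It assumes search time grows roughly as b**d, where b is an effective branching factor and d the depth in plies. The function name and the sample values of b are assumptions chosen for this sketch, not figures from Tiger or any other program.

```python
import math

def extra_plies(speedup: float, branching: float) -> float:
    """Extra depth bought by a speedup, assuming time grows as branching**depth."""
    return math.log(speedup) / math.log(branching)

# Hypothetical effective branching factors; real programs differ.
for b in (2.0, 3.0, 6.0):
    gain = extra_plies(300, b)
    print(f"effective branching factor {b}: 300x faster is about {gain:.1f} extra plies")
```

Under this model a speedup multiplies the time available but only adds to the depth, so 300 times the speed buys a handful of plies. It says nothing about how much each extra ply is worth, which is the question debated in this thread.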
>>>>>>>>>
>>>>>>>>>Is it possible that there is also a problem with bad evaluations infecting whole
>>>>>>>>>branches in the tree of analysis? In Fritz vs. Gambit Tiger at Leiden, Fritz
>>>>>>>>>played 21.b4, shutting in its queen. Was this not a dreadful move? Yet, I had
>>>>>>>>>Fritz analyze after this point through 18 ply, and the evaluation was just +0.06
>>>>>>>>>(after which it mysteriously halted analysis). And Tiger 14 has reached 20 ply
>>>>>>>>>looking at this same position, with an evaluation of just +0.46 after 21...Nc3
>>>>>>>>>22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Bb2 g5 26.Rde3 Bf4 27.Bxe4 fxe4 28.Rxe4 Rxe4
>>>>>>>>>29.Rxe4 Qxd5 30.Re7. Actually, the final position is lost for White after
>>>>>>>>>30...Qd3+ 31.Re2 Qb1+ 32.Ne1 Bf5, but White doesn't need to play 30.Re7. The
>>>>>>>>>point is that neither program, given even 10-12 hours to think (on a PIII 850)
>>>>>>>>>appreciates the disastrous effects of White's missing queen. As poor evaluations
>>>>>>>>>like this clog up the search, all lines begin to look like one another, despite
>>>>>>>>>huge differences between them that would be clear to any human player examining
>>>>>>>>>these positions.
>>>>>>>>>Regards,
>>>>>>>>>Vine Smith
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>I do not agree.
>>>>>>>>
>>>>>>>>Tiger KNOWS about the bad position of the Queen after b4 and would never play
>>>>>>>>this move.
>>>>>>>>
>>>>>>>>If you try, you will see that Tiger's evaluation is different in the lines where
>>>>>>>>the queen is trapped and in the lines where it is not.
>>>>>>>>
>>>>>>>>The evaluation difference is not big, but it is enough to avoid such a
>>>>>>>>disastrous move in almost all the cases, and to try to find a way to free the
>>>>>>>>queen if it happens to be trapped by a long sequence of forced moves.
>>>>>>>>
>>>>>>>>Tiger is able to identify some cases of blocked pieces or pieces with poor
>>>>>>>>mobility in its evaluation. In particular, it is able to see that the queen is
>>>>>>>>blocked after b4? and gives a penalty for this. I have worked hard on this part
>>>>>>>>of the evaluation, so I can't let you generalize and say that any program would
>>>>>>>>ignore the consequences of the trapped queen. Mine knows.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>    Christophe
>>>>>>>Hi --
>>>>>>>First of all, I want to say that I like Tiger 14 and Gambit Tiger 2.0 very much,
>>>>>>>and was quite impressed by the game against Patzer at Leiden. That was some
>>>>>>>terrific chess!
>>>>>>
>>>>>>
>>>>>>Thanks!
>>>>>>
>>>>>>
>>>>>>
>>>>>>>But wouldn't you like to see a better evaluation from Tiger than +0.46 after
>>>>>>>21.b4? That kind of score would make me think (if I didn't see the position),
>>>>>>>"Oh, Tiger must have the two bishops and more space; or the opponent has a
>>>>>>>couple of weak pawns." I would never imagine that instead, the opponent's most
>>>>>>>powerful piece had been locked in by pawns and rendered completely immobile. The
>>>>>>>evaluation leads me to wonder if Tiger would happily win a pawn at the expense
>>>>>>>of liberating the queen, which seems like too low a price for such a gift to the
>>>>>>>opponent.
>>>>>>>I must also point out that 17 ply deep, the evaluation was +0.48, and the line
>>>>>>>shown was 21...Nc3 22.Rd3 Qf6 23.Kf1 Ne4 24.Bd4 Qf7 25.Rde3 Bf4 26.Rd3 c6
>>>>>>>27.dxc6 Bxc6 28.Bb3 Bd5 29.Bxd5 Qxd5 30.Qb6 Qa2. In this line, with 26...c6,
>>>>>>>Tiger releases the queen, and for what? Of course, as it approaches this point
>>>>>>>during actual play, it may change its mind, but its decision about what the
>>>>>>>correct 21st move was for Black (at 17 ply deep) was based on a line in which it
>>>>>>>gratuitously released the queen.
>>>>>>>This was Tiger 14 doing the analysis -- is there any difference from Gambit
>>>>>>>Tiger 2.0 regarding how the trapping penalty is applied?
>>>>>>
>>>>>>
>>>>>>
>>>>>>First I would like to point out that my program would NOT play b4.
>>>>>>
>>>>>>Don't forget that the b4 mistake has been played by Fritz, not by Tiger. So if
>>>>>>you want to blame somebody........ :)
>>>>>>
>>>>>>Then you can argue about the value I'm using for the penalty for a trapped
>>>>>>queen.
>>>>>>
>>>>>>Here is how I set such a penalty value, usually: I set it as low as possible.
>>>>>>High enough so the program understands that there is a problem and does not play
>>>>>>the faulty move, AND low enough so it is not going to interfere in a crazy way
>>>>>>with the program's playing style.
>>>>>>
>>>>>>The idea behind this is that, as the programmer, I cannot think about all the
>>>>>>consequences of such a penalty. When I introduce such a penalty in the
>>>>>>evaluation it's because I have a set of positions where it is supposed to help,
>>>>>>but no set of positions can have a good enough statistical significance. So
>>>>>>there are obviously a lot of cases where the penalty will be counter productive,
>>>>>>and I have no idea of what these positions will be (sure, I will soon discover
>>>>>>some in test games, if I set the penalty too high).
>>>>>>
>>>>>>So you might think that the trapped queen is worth one pawn or more, but still
>>>>>>I'll give it a much lower weight in Tiger's evaluation.
>>>>>>
>>>>>>And you must also take into account the fact that the trapped queen is not only
>>>>>>bad "by itself" (queen mobility = 0), but it is also bad because of the
>>>>>>consequences, and don't forget that the search is going to catch (understand)
>>>>>>some of these consequences (the search might be able to see that White is unable to
>>>>>>defend against an attack because the queen cannot come near).
>>>>>>
>>>>>>So usually you do not need to give a really high positional penalty. A little
>>>>>>penalty PLUS the additional positional problems that are going to be found by
>>>>>>the search might come close to the overall penalty you would have given
>>>>>>yourself, as a human player.
>>>>>>
>>>>>>
>>>>>>
>>>>>>    Christophe
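
Illustration (not part of the original posts): a minimal sketch of the "as low as possible" penalty described above. Every name and number here is hypothetical, scores are in pawns, and this is a toy model rather than Tiger's actual evaluation.

```python
# Hypothetical weight, in pawns: small on purpose, as described in the post.
TRAPPED_QUEEN_PENALTY = 0.15

def evaluate(material: float, positional: float, queen_trapped: bool) -> float:
    """Toy static evaluation from the mover's point of view, in pawns."""
    score = material + positional
    if queen_trapped:
        score -= TRAPPED_QUEEN_PENALTY
    return score

def best_move(candidates: dict) -> str:
    """Pick the candidate with the highest numerical score."""
    return max(candidates, key=candidates.get)

# Case 1: nothing else is at stake, so the small penalty tips the choice.
quiet = {
    "developing_move": evaluate(0.0, 0.10, queen_trapped=False),
    "queen_trapping_move": evaluate(0.0, 0.20, queen_trapped=True),
}

# Case 2: a pawn is won, so the small penalty does not override the material gain.
greedy = {
    "developing_move": evaluate(0.0, 0.10, queen_trapped=False),
    "pawn_grab_trapping_queen": evaluate(1.0, 0.0, queen_trapped=True),
}

print(best_move(quiet))
print(best_move(greedy))
```

The second case is the trade-off under discussion: a low weight leaves material decisions to the rest of the evaluation and to the search, which may or may not find the longer-term consequences.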
>>>>>Hi --
>>>>>
>>>>>I certainly hope I didn't imply that Tiger would play 21.b4. In fact, I checked
>>>>>this right after interrogating Fritz about this move, and Tiger, I believe (I
>>>>>don't have the analysis at hand), chose 21.b3. I don't really understand this
>>>>>choice, either, but many programs go this way, some mentioned by Sune Larsson in
>>>>>an earlier post, and I found that Yace 0.99.01 selected this as well.
>>>>>Your explanation of how the penalty is applied was very helpful in understanding
>>>>>the evaluations. I suppose that if it is set too high, the program would
>>>>>possibly be overly casual about giving up pawns, or even pieces.
>>>>>I do understand that the immobility of the queen is not in and of itself going
>>>>>to win the game -- in fact, Gambit Tiger's approach versus Fritz seems to be
>>>>>quite correct: open up the game, create some weaknesses to attack, and the
>>>>>queen's absence will be keenly felt. Any attempt to win the queen appears to
>>>>>fail, as I have investigated this myself, and also Tiger's 20-ply search after
>>>>>21.b4 strongly implies this is not possible. The fact that the punishment for
>>>>>the crime takes so long makes this an excellent example of long-term positional
>>>>>play.
>>>>>But one last question: pretend that Fritz's 21.b4 is forced, and that I am using
>>>>>Tiger to analyze starting from, let's say Black's move 17, and that besides the
>>>>>course in the game, there is one other significant line, all the others are no
>>>>>good for Black. For the other significant line, there is an evaluation of +0.60,
>>>>>and this is based on "normal" factors, such as pawn structure and superior minor
>>>>>pieces. So when I run an 18-ply analysis from move 17, attempting to learn the
>>>>>truth of the position, won't Tiger show me the other significant line, rather
>>>>>than the more favorable queen-trap line due to the "artificially" low penalty?
>>>>>And given the choice, would Tiger steer towards the hypothetical other line, or
>>>>>does it have some way of recognizing that the numerically lower evaluation
>>>>>actually represents a more favorable position?
>>>>>Thanks for taking the time to discuss all of this with me.
>>>>>Regards,
>>>>>Vine
>>>>
>>>>
>>>>
>>>>Only numerical factors are taken into account, so if there is a line scoring
>>>>+0.48 and another scoring +0.60, Tiger will go for +0.60.
>>>>
>>>>I accept the responsibility for choosing low weights for such penalties, even if
>>>>it is possible to show that in *some* positions the penalty should have been set
>>>>at a higher value. For any such position I'll be able to find another
>>>>position where my low penalty is already too much and is causing a disaster.
>>>>
>>>>All that counts is the results in real life, which can be measured only by
>>>>estimating the overall strength of the program, not individual positions.
>>>>
>>>>That's why I wouldn't blame Frans too much either for choosing to ignore some
>>>>parameters such as trapped queens or king safety. The strength of his program
>>>>advocates for his choices, and it is the duty of other programmers to prove he's
>>>>wrong, if they can.
>>>>
>>>>
>>>>
>>>>    Christophe
>>>Definitely, your approach seems very reasonable, and the practicalities of chess
>>>programming require such decisions. But going back to the original point, where
>>>there was speculation about the future 300 GHz machines, would this not imply
>>>that the additional plies of search might be quite useless in many cases? As the
>>>number of nodes increases, so do the possibilities of running into one of these
>>>"exception" positions, where the evaluation does not match the position's true
>>>potential.
>>
>>
>>
>>No, this is not what happens.
>>
>>These errors happen all the time and do not have a disastrous impact.
>>
>>Search beyond the positions where positional misunderstanding happens corrects
>>the positional mistakes.
>>
>>As you have noticed, the last plies of the best lines displayed by a computer
>>are often full of positional mistakes, but the first plies are much more
>>accurate.
>>
>>A deep search will show mistakes in the last few plies, but a majority of the
>>first plies is going to be accurate.
>>
>>Increasing the number of positions searched indeed increases the number of
>>positions where the evaluation is wrong, but it does not increase the PROPORTION
>>of these particular positions. So their impact does not increase, and deeper
>>searches still produce better results.
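
Illustration (not part of the original posts): the count-versus-proportion point above, written out as arithmetic. It assumes a fixed share of leaf positions is misjudged regardless of depth, which is the claim itself taken as a premise, so this restates the argument rather than proving it. The branching factor and error rate are hypothetical.

```python
def leaf_error_stats(branching: int, depth: int, errors_per_thousand: int):
    """Return (leaves, misjudged leaves, misjudged share) for a uniform tree."""
    leaves = branching ** depth
    wrong = leaves * errors_per_thousand / 1000
    return leaves, wrong, wrong / leaves

# Hypothetical values: branching factor 3, 10 misjudged positions per thousand.
for depth in (4, 8, 12):
    leaves, wrong, share = leaf_error_stats(3, depth, 10)
    print(f"depth {depth}: {leaves} leaves, about {wrong:.0f} misjudged, share {share:.3f}")
```

The number of misjudged positions grows with the tree, while their share stays the same.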
>>
>>
>>
>>
>>> Eventually, the possibility will approach 100%, and the evaluations
>>>will blur into a continuum where tactically sound lines all appear roughly
>>>equivalent to each other, separated by differences of hundredths of a pawn in
>>>not entirely meaningful fashion.
>>>I think if Kramnik considers the type of weakness Fritz displayed in the game
>>>against Gambit Tiger, he can steer play towards positions requiring long-term
>>>decisions and absolutely dominate even against Deep Fritz on an 8-processor
>>>machine. The flaw is more general than trapped pieces, it extends to any
>>>long-term factor that cannot be given a "proper" score due to its short-term
>>>effects on the program's play. But I don't "blame" Fritz, or its programmers, I
>>>just hope that someday a solution will be found to handle these exception cases.
>>
>>
>>
>>I'm not sure it's an important problem to focus on.
>>
>>It's what you want to see from a human point of view because it would please
>>your sense of beauty, but nothing proves that it really improves the playing
>>strength.
>>
>>I have added knowledge about trapped pieces because it was a means to improve the
>>playing strength of my program, but I'm not ready to increase the penalty given
>>to trapped pieces to the value a human player would evaluate them just for the
>>beauty of it.
>>
>>
>>
>>    Christophe
>Okay, at this point I give up and just wish you and the twin Tigers the best of
>luck at future tournaments!


Thanks!



> Thanks again for your input.


See you...



    Christophe
Last modified: Thu, 15 Apr 21 08:11:13 -0700