Author: Uri Blass
Date: 13:29:47 08/17/02
On August 17, 2002 at 14:40:28, Vincent Diepeveen wrote:

>On August 17, 2002 at 09:53:25, Uri Blass wrote:
>
>Both have valid points, but the main thing is not really clear to most people: this testset is weird in that the worst positional middlegame program, Fritz 7, is number 1 on the middlegame positions, and the worst endgame program, Gambit Tiger 1.0, has the highest score on the endgame problems.
>
>So the testset is a bit funny.
>
>The reason for this is that the vast majority of positions require 'patzer moves'. In the case of the middlegames, the vast majority of positions require aggressive handling of the opponent's king safety, something Fritz does well.
>
>In the case of the endgames, huge scores for obvious advantages solve everything there. See why Tiger is doing so well there: just don't care about that pawn and promote your passer! Give a 3-pawn bonus for pawns on the 3rd rank and even more for the 2nd rank.
>
>Things like that. The testset measures the aggressiveness of engines. It does not indicate how strong engines are at all.
>
>For example, the number of queenside positions is very small. You can count them on one hand, and even those positions require a patzer move.
>
>The other problem, that there are only 'best moves' to find and no 'avoid moves', is also true. That is nevertheless not as major a problem as the patzer problem.
>
>How can an engine that doesn't know a thing about bishop versus knight, compared to the other top engines, be #1 on the positional testset?
>
>These guys of course make the same mistakes as others. We have seen this before, of course. The GS2930 testset is a good example of another testset where just giving away a pawn for 2 checks gives the engine in question a high score.
>
>That really shows a problem in all these testsets. In that respect even the hardest work is useless, of course.
>
>When I saw this testset, I was happy that it had new positions, but the claims of the authors are not realistic.
>
>The testset measures how aggressive the programs are at the positions in the testset. It measures nothing that has to do with the real strength of engines.

I prefer a test suite that is simply based on mistakes of computers in comp-comp games. No chess knowledge is needed to compose the test.

People can take Yace and search for tactical mistakes in comp-comp games by computer analysis. I suggest using Yace for the analysis because it is not a root processor and can learn from previous searches.

People can give it one hour to analyze every interesting position (one where neither side is ahead by more than 2 pawns after a short search) and look for positions where the change in the evaluation suggests that the move played was a tactical mistake.

They can later check it by deep analysis with other programs, and if the other programs also agree that the move was a mistake, the position can go into the test suite.

I can give one example from a game of Movei, Bestia 0.88-Movei (3rd division):

r6k/5r1p/1p2pP2/3p1p2/p1bB1P2/4P3/PP4RP/R5K1 b - - 0 25 am b5

Movei played b5 here and lost the game. Black's position is not good, but b5 loses immediately after Kf2, while I believe things are less simple after moves like Rg8.

Uri
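[Editor's sketch] A minimal illustration of the screening step described in the post: walk through computer-vs-computer games, re-analyze each position briefly, and flag moves after which the mover's evaluation drops sharply as candidate "avoid move" (am) positions. The verification pass with other programs is not shown. This assumes the modern python-chess library and a UCI engine; the engine path, time limit, and the thresholds are illustrative choices, not taken from the post.

    import chess
    import chess.engine
    import chess.pgn

    ENGINE_PATH = "./yace"                  # hypothetical path to the analysing engine
    BALANCE_CAP = 200                       # only consider positions within ~2 pawns
    BLUNDER_DROP = 150                      # centipawn drop that flags a candidate mistake
    SCREEN = chess.engine.Limit(time=1.0)   # short screening search per position

    def scan_game(game, engine):
        """Yield EPD strings for moves that look like tactical mistakes."""
        board = game.board()
        for move in game.mainline_moves():
            mover = board.turn
            before = engine.analyse(board, SCREEN)["score"].pov(mover).score(mate_score=10000)
            if abs(before) > BALANCE_CAP:
                board.push(move)
                continue                    # position already decided, skip it
            board.push(move)
            after = engine.analyse(board, SCREEN)["score"].pov(mover).score(mate_score=10000)
            if before - after >= BLUNDER_DROP:
                board.pop()
                yield board.epd(am=move)    # record the played move as an "avoid move"
                board.push(move)

    if __name__ == "__main__":
        with open("comp_comp_games.pgn") as pgn, \
             chess.engine.SimpleEngine.popen_uci(ENGINE_PATH) as engine:
            while (game := chess.pgn.read_game(pgn)) is not None:
                for epd in scan_game(game, engine):
                    print(epd)

Each flagged position is emitted in EPD form with the played move recorded under the "am" opcode, matching the format of the Bestia-Movei example above; those candidates would then be re-checked with deeper searches by several programs before being accepted into the suite.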