Computer Chess Club Archives


Subject: Re: Fruit 2.2 vs Toga II

Author: Stephen Ham

Date: 08:55:18 10/19/05



"snipped"


>>Dear Gents,
>>
>>I'm testing Toga II, Shredder 9 and Fruit 2.2 at home. Based upon my very long
>>time-control tournaments and matches, and test positions from my correspondence
>>games, Toga II is stronger than Fruit 2.1, but not as strong as Fruit 2.2 or
>>Shredder. So this confirms what Dann wrote.
>>
>>Regarding style, I see that Toga II and Shredder 9 find tactical shots fastest
>>and they accordingly play in a more aggressive style than Fruit 2.2.
>>
>>I'm really impressed with Fruit 2.2. It will indeed find tactical shots, but
>>takes longer to find them than Shredder 9 or Toga II. Fruit 2.2 plays in a
>>steady but generally straightforward style. While it's not "positional", it's
>>certainly not naturally aggressive. So if given enough time to find tactical
>>shots, it will play them.
>>
>>In the endgame, I'm impressed with Fruit 2.2. Again, while it doesn't have a
>>dramatic/dynamic style of play in the middle game (it's a little dull), it also
>>seems to play a relatively risk avoiding endgame. Regardless, it's an effective
>>and efficient player, even if its moves seem dull.
>>
>>Shredder 9 is also a strong endgame player and has some advantage due to EGTBs.
>>But Shredder has a more dynamic style in the endgame too. Generally, these
>>engines are probably equally good endgame players, but still quite different
>>stylistically. I'm investigating a position now where Shredder and Fruit 2.2
>>select different candidate moves, and their solutions are entirely different.
>>I'm giving each of them 24-hours to examine the position. Then I'll give each of
>>the engines the PV from the other, to see if they find each other's solution,
>>and how long they take.
>>
>>For me, the biggest advantage of Fruit 2.2 over Shredder 9 is that Shredder has
>>a very "optimistic" (read: wildly inaccurate) evaluation function. Dead equal
>>positions are sometimes assessed by Shredder as being wins. And Shredder will
>>flip back and forth in dynamic positions regarding who has an advantage. Fruit
>>2.2, however, has a relatively accurate evaluation, superior to Toga's as well.
>>
>>Since Fruit 2.2 isn't a naturally dynamic player, some claim that it's
>>positional. I don't find that to be true at all. Instead, it's just rock-solid
>>with no clear weaknesses. I think Hiarcs 9 and Pro-Deo are the best positional
>>engines, IMHO, but they get outsearched in dynamic positions.
>>
>>That said, I've played long time-control matches and games with the three, being
>>careful to follow Uri's advice to add other engines to the mix. So I added
>>Hiarcs 9 and Junior 9. While each CB engine has its own book, I have none for
>>Toga II and Fruit 2.2 when playing in a CB GUI. So I gave Toga II a book I
>>created from my correspondence games, including my TNs and my private analysis.
>>Fruit has played with either a "solid" Fritz 7 book or Nimzo 7.32 (I'm trying to
>>find which is more compatible).
>>
>>In general, the results show Shredder 9 and Fruit 2.2 on top, with a very slight
>>edge to Shredder. Toga II is often close behind, while Hiarcs and Junior always
>>finish on the bottom. I think Junior is hurt by a bad book that I manually fix
>>and update after each loss.
>>
>>The neat thing about Shredder 9 and Fruit 2.2 is they are the only engines I've
>>seen that can still win from bad positions by outplaying Junior and Hiarcs.
>>Sure, other strong engines can outplay and defeat weaker engines from bad
>>positions. But Junior 9 and Hiarcs 9 are already very strong. But sometimes,
>>Fruit or Shredder will make one bad move that gets them into trouble. But they
>>sometimes are able to still win from bad positions - which I find impressive.
>>
>>All the best,
>>
>>Steve
>
>Hi Steve,
>
>I, too, have been running some long time control games to see which engine is
>actually the best in this time zone that is really untested. So much emphasis is
>placed on blitz that it skews everything we know about engine strength. I would
>agree with what you say except I haven't been able to test Fruit 2.2 (maybe ICD
>will sell it, in which case I'd be more comfortable giving up a credit card
>number!?).
>
>So far I haven't found an engine that significantly beats Shredder 9, except as
>you mentioned with Shredder's flailing evaluation. I've watched it frequently
>switch back and forth between 2 or 3 moves on some positions where it continues
>to score each move 0.01 better at each ply only to flip back the next ply. In a
>few cases, Hiarcs has demonstrated some unusual ability to solve strategic
>positions, but it hasn't occurred enough to make a case for it.
>
>One thing I have done for the opening book is to compile a book where all the
>source games were played by 2600+ or 2700+ players on both sides. The 2700+
>book is too small though to be fair. I continue to let Hiarcs use its own book
>though, as the engine seems tuned for certain positions and it seems fair to
>allow it to play for those types of positions.
>
>Another issue for me has been time control. In order to take time management out
>of the equation, preferred since I want to focus on chess strength, I tried
>playing games at 30' per move. But this time control doesn't work well with the
>GUIs, and if you allow for book moves you wind up with a great variance in the
>time each position is analyzed. I would also say that "pondering" is a non-issue
>and not necessary, but that large hash tables are more important. Lately, I've
>played a few games on two computers at "game in 360'", which forces the engine
>to completely manage time. I wonder how you're handling the time controls as
>we'd clearly like to have a level playing field?
>
>Regards,
>
>Chuck

Hi Chuck,

First of all, I must provide you with a caveat. I'm a computer dummy. So, what
little I've actually learned has come from reading about chess engines from the
experts who post here at CCC.

That said, I too think that time management can be a factor, especially at fast
time controls. But I'm concerned about pure calculating ability, and the
capacity to consistently find strong moves. So, I play matches and tournaments
over extra-long time controls and steer totally clear of short time controls.

I believe that the programmers probably placed greater emphasis on time
management in games that mimic real tournament conditions, such as 40/120 with
normal tournament time controls thereafter. That said, I question whether
game/40' or game/150' produces accurate results regarding engine move-selection
performance. I doubt it. Without a secondary and tertiary time control, the
engine may not know how to manage time in a complex position since it doesn't
have a "crystal ball" to know how many moves remain in the game. Hence, it
doesn't know how to allocate time at a crucial juncture.

But with a known time control at 40 moves, the engine "knows" that it can
allocate a certain amount of time to solving the position on the board since it
only needs to make X moves to reach time control. For example, suppose a highly
complex and dynamic position exists on the board at move 31. In a game/150', the
engine can't know how to properly use its time and so will generate a move after
a possibly arbitrary time allocation. But in a 40/120' time control, with
reasonable secondary time-controls (e.g. 20/60') and tertiary (game/30'), the
engine knows best how to allocate time.

In the above scenario, the engine knows it merely needs to make 10 more moves to
reach the first time control. So, it has the potential to allocate a large block
of its remaining time on move 31 in order to try to best "solve" the position.
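
To make the move-31 example concrete, here is a rough Python sketch of the
arithmetic. It is only an illustration of the idea above, not how any real
engine manages its clock. The function name, the even split of remaining time,
the 45 minutes left on the clock, and the guess of 30 moves remaining under
sudden death are all assumptions made up for this example.

```python
def budget_per_move(remaining_minutes, moves_to_control=None, assumed_moves_left=30):
    """Naive even split of the remaining clock time (hypothetical helper).

    moves_to_control: moves still to be made before the next time control
        adds more time, or None for a sudden-death control such as game/150'.
    assumed_moves_left: the engine's guess at how many moves remain when no
        time control is coming (an arbitrary assumption here).
    """
    divisor = moves_to_control if moves_to_control is not None else assumed_moves_left
    if divisor <= 0:
        raise ValueError("number of moves must be positive")
    return remaining_minutes / divisor


# Move 31 in a 40/120' game: moves 31 through 40 are still to be played.
with_control = budget_per_move(45, moves_to_control=10)

# Same clock in sudden death: the engine has to guess how long the game lasts.
sudden_death = budget_per_move(45)

print(with_control, sudden_death)
```

With these made-up numbers the engine could afford 4.5 minutes per move when
it knows the control is 10 moves away, against 1.5 minutes per move when it
must guess. A real engine would presumably spend unevenly rather than split
the time equally, but the point about knowing how many moves remain is the
same.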

But, just to ensure that I'm giving the engines sufficient time to perform, my
time controls usually exceed 40/150', 20/75', game/30'. I play on a very fast
AMD with large RAM allocation.

Regarding opening books, I always mate the engine's book to the engine for CB
engines. But if one book runs out of moves while the other books continue, then
I'll supplement that book in post-game analysis to try to even the playing field
for the next test. Also, some books are just plain inferior (Junior 9 and Hiarcs
9 are examples) and so I've corrected the lines in post-game analysis or steered
the engines to new lines to again try to level the playing field.

But I'm not 100% objective regarding opening book allocation, since I have a
personal interest in this. I'm a competitive correspondence chess player. As
such, I have lots of my own opening ideas (TNs) and preferences that I want to
test for objectivity. So I've modified the Shredder 7 opening book to play only
"my stuff" in order to test it. But I don't want to give that book to my
strongest engine, since I'll never know if the good results were due to the book
or the engine. So I've given the book to Toga II. I don't know if that's an
ideal match for Toga II, since I'm not a tactically sharp player (I'm more
technical - positional). But I enjoy this since Toga II doesn't have a native
book available for the CB GUI, and I get to test engines and to see how my ideas
work in "objective" competition. If my opening book busts a line that another
engine plays, then I'll fix the book of the other engine so the playing field is
potentially leveled for future play.

So that's how I test. It's an on-going and never ending series of games. It may
not be 100% scientific, but I think I now have a pretty good idea how the
engines perform. And I'm a relatively strong chess player myself, so I play
through each and every game just to gauge the performances of the engines and
make mental notes of strengths/weaknesses. When I see interesting positions,
then I'll contrast and compare engines by allowing all of them (e.g. Shredder 9,
Toga II, Fruit 2.2, etc.) a chance to handle that position. The results of these
tests interest me.

Chuck, you wrote of an "...engine that significantly beats Shredder 9." That
doesn't exist yet. Sometimes Fruit 2.2 wins my tournaments, but it does so by
scoring better against the other engines, rather than clearly besting Shredder 9
in 1:1 competition. I've also had several engine matches against Shredder 9.
Sometimes it wins and sometimes it doesn't. But when it doesn't win, the margins
are slight. So no engine consistently beats Shredder 9. And when Shredder 9 does
lose, it's often due to one bad move selection, because it's generally a very
reliable all-around performer in those games.

My guess is that while Shredder 9 and Fruit 2.2 have completely different
styles, they're of approximately the same strength overall. I think Shredder 9's
evaluation function is terrible, and this sometimes causes variability in its
performance (ranging from exciting and brilliant, to a bad move selection or
two, per game). Fruit 2.2 just plays rock-solid chess with less performance
variability. Yes, it may not elect to play the sharpest lines, but nobody can
argue with its results. So when Shredder 9 loses, it's often that Shredder 9
"shot itself in the foot" with one or two bad move selections, rather than being
outplayed throughout the game. And I've seen Shredder 9 come back from bad
positions and still win, which I find impressive.

Conversely, I've seen Hiarcs 9 and Junior 9 carry attractive positions into the
late middle-game and/or endgames, only to get outplayed by Shredder 9 and
Fruit 2.2 thereafter.

One odd note - I've tested Fruit 2.2 in games where I've forced it to play the
Modern Defense as Black. While these lines are objectively inferior for Black
IMHO, Fruit 2.2 has scored exceptionally well. Yes, it often evaluates its
position as slightly inferior too, but then proceeds to dominate the game all
the way to victory. So, I encourage others to test Fruit 2.2 in dynamic
hypermodern openings, rather than classical pawn center openings.

Hopefully I've answered all your questions, Chuck.

All the best,

Steve




Last modified: Thu, 15 Apr 21 08:11:13 -0700
