Computer Chess Club Archives


Subject: Re: Fritz7 vs Crafty 18.12

Author: José Carlos

Date: 09:20:49 12/18/01


On December 18, 2001 at 11:38:53, pavel wrote:

>On December 18, 2001 at 11:08:47, José Carlos wrote:
>
>>On December 18, 2001 at 10:19:05, pavel wrote:
>>
>>>On December 18, 2001 at 09:52:49, José Carlos wrote:
>>>
>>>>On December 18, 2001 at 08:20:52, pavel wrote:
>>>>
>>>>>(Arguably) The strongest commercial chess program vs (Arguably) the strongest
>>>>>freeware chess program, in a very arguable matchup.
>>>>>
>>>>>;)
>>>>>
>>>>>--------------------------
>>>>>Book = 2600.ctg
>>>>>Hash = 50mb both
>>>>>TB = none.
>>>>>Time Control = 5min/side
>>>>>Ponder = off
>>>>>Hardware= Pentium III/ 512mb ram.
>>>>>OS = Windows 2000 Pro.
>>>>>---------------------------
>>>>>
>>>>>
>>>>>
>>>>>  Program             Elo    +   -   Games   Score   Av.Op.  Draws
>>>>>
>>>>>  1 Fritz 7         : 2580   36  58   200    71.5 %   2420   21.0 %
>>>>>  2 Crafty 18.12    : 2420   58  36   200    28.5 %   2580   21.0 %
>>>>>
>>>>>
>>>>>
>>>>>Individual statistics:
>>>>>
>>>>>(1) Fritz 7                   : 200 (+122,= 42,- 36), 71.5 %
>>>>>
>>>>>Crafty 18.12                  : 200 (+122,= 42,- 36), 71.5 %
>>>>>
>>>>>
>>>>>(2) Crafty 18.12              : 200 (+ 36,= 42,-122), 28.5 %
>>>>>
>>>>>Fritz 7                       : 200 (+ 36,= 42,-122), 28.5 %
>>>>>
>>>>>
>>>>>
>>>>>The difference between freeware chess programs and commercial programs seems to
>>>>>be just getting bigger. OK, OK, probably this result doesn't say much, but I am
>>>>>sure this is the case. Or is it that arguable?
>>>>>
>>>>>
>>>>>Have fun,
>>>>>Pavs.
>>>>
>>>>  Let's be scientific. Your test shows:
>>>>
>>>>  Fritz 7 + 2600.ctg
>>>>  seems stronger at 5 min/game on a PIII (unknown MHz) + ponder off than
>>>>  Crafty 18.12 + 2600.ctg
>>>>  with a certain degree of confidence given by the number (200) of games.
>>>>
>>>>  Neither program uses its default book. The time/move is unknown since we
>>>>don't know the clock speed. Ponder is off, which is not a default setting.
>>>>
>>>>  I don't know your test is worthless, don't get me wrong. I only say it does
>>
>>  Sorry here, my horrible english... I meant: "I don't mean your test is
>>                                                       ^^^^
>>worthless, don't get me wrong."
>>
>>>>not prove anything but the above stated. Nothing about commercials or amateurs;
>>>>fritz or crafty; fritz or crafty + default settings; and so on...
>>>>
>>>>  José C.
>>>
>>>
>>>Oh yeah, of course, I forgot to put: Pentium III 1 GHz.
>>>
>>>I am not going to try to say that my test is the best, but it is probably
>>>not worthless either.
>>
>>  As I say above, I don't think so either.
>>
>>>1) Ponder off is a default setting under the CB interface. Since neither program
>>>is pondering, I don't see a problem.
>>
>>  The problem is that Bob has stated many times that 'his default' is ponder on.
>>So ponder off is not the default for Crafty, and in some way it hurts its strength.
>
>Yes, I have seen it discussed here many times before. And to be honest, I was
>against Crafty playing with ponder=off for a long time. But the problem is that
>there is no valid data (even though I understand the author insists) to prove
>that it affects the playing strength of the program.
>
>
>
>>
>>>2) Both programs used the same opening book, from a well-known set of PGN
>>>files. If there is anything wrong with the opening book, both programs will
>>>suffer, as the opening is reversed in every game. IMO the strength of the
>>>program doesn't include the opening book; the opening book is a way to increase
>>>the strength of a program.
>>
>>  This has been discussed many times, so maybe I shouldn't bring it up again, but
>>I can't resist :)
>>  The book is part of the program. Different books make the program play
>>different positions. If you use a book with very positional lines in a Hiarcs -
>>GT match it will probably benefit Hiarcs. If you use a wild book, it will
>>probably be better for GT.
>>  In both cases the book is the same for both programs, but the result is quite
>>different.
>>  The book, like the rest of the program, has a 'style'. For example, I've been
>>working on a tournament book for my program for several months. I don't only
>>choose 'correct' lines, but lines where my program plays correctly. I've found
>>many pawn sacs in GMs' games that make my program instantly show -0.90. I don't
>>want such lines in my book even if they're correct... but GT would probably
>>love them...
>
>
>Interesting. I agree that the book does define the playing strength of a
>program. But the question at hand is: if two programs are not playing with their
>default books, shouldn't that affect the playing strength of both programs,
>considering that the opening is reversed for both programs?

  Maybe I didn't make my point clear. A 'general' book might:
  - be very similar to one opponent's real book, and so benefit it
  - be far different from both opponents' real books, but close in style to one
of the programs
  - have deep lines that go right to an endgame in most games, and so benefit
the program that plays endgames best
  - have short and speculative lines that benefit the more tactical program
  - ...

  Such a book might be fair, or might not. It's very difficult to predict. But
if you use the book the author suggests and it is a bad book, it's not your
fault, but the programmer's.

>>
>>>It is a well-known fact in this forum that you can never be perfect in an
>>>eng-eng match, no matter how many games you play or whatever precautions you
>>>take.
>>
>>  Sure. And I have no problem about it, since it happens to all of us. I only
>>have 'problems' (not really problems... simply I disagree) with incorrect claims
>>about the meaning of the matches.
>>
>>>Even though the games were just for fun, I was just trying to get some meaning
>>>out of them.
>>
>>  Yep, that's the problem. Getting meaning out of games is difficult and
>>'dangerous'.
>>  I'll tell you a little story: when I first read a post of Christophe's claiming
>>that a lost game is worthless for him, I thought he was just disappointed at
>>losing. Later, I rewrote his words many times and understood his point. The
>>point is: you modify something in your program in order to get a better
>>performance over a lot of games, not in order to correct something in
>>particular. That way, you get better for sure. So, it's something like a
>>'quantum chess'. A single game (particle) means nothing. It's a big number of
>>them that makes sense. And not only that: the meaning depends on the
>>'circumstances' in which the games were played.
>>
>>
>>>regards
>>>;)
>>>pavs
>
>What do you suggest would be a more "fair" match up between these 2 programs?

  First, I'm not talking about "fair", just about drawing the right conclusions
from the test. BTW, I don't believe anything "fair" exists in life ... :(

>I will state a few ideas; pls let me know what you think.
>
>1) Default Opening Book for the programs.

  That's usually a good idea to test "the program". But you can also test
program A + book X, and say "combination A+X performs this way:..."

>(The problem here: does Crafty have a default opening book? The one that's
>available for download has been used for a lot of versions, and I'm not sure if
>it has been tweaked for Crafty's style of play.

  It's up to Bob to answer this question, I think.

>Even if, let's say, we want to use that book, we can't use it under the CB
>interface AFAIK. So we would have to run Crafty under WinBoard on one computer
>and make it play against Fritz 7 with its own GUI on another computer.)

  I had an old version of Fritz (not sure which number, I don't use it anymore),
and I remember I created an empty book with no moves, attached it to the WinBoard
programs, and they used their own books.
  Also, you can download the PGN Bob used to compile the book and compile a
Fritz ctg or cbh or whatever it's called.

>2) Ponder=on

  Again, you can use ponder off and report "crafty + ponder off performs..." but
not just say "crafty performs...".

>3) 2 Identical CPUs
>
>(That's an obstacle I can't overcome; I'd have to buy another computer.)

  I only have one computer and test with ponder off. I wish I had more money...
:)

>4) Tournament Time Control

  Again, if you state "program A performs ... at time control xxx on a zzz
machine" it's just fine :)

>5) A reasonable number of games: 500. IMO that's reasonable enough.

  Yep.
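
  To put a rough number on "reasonable", here is a minimal sketch that turns a
win/draw/loss count into an Elo difference with an approximate 95% interval. It
assumes the usual logistic Elo model and a plain normal approximation on the
score; it is not how ELOstat computes its +/- columns. The function names
elo_diff and match_summary are made up for illustration, and the only data used
is the +122 =42 -36 result quoted at the top.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction, logistic model.

    Assumes 0 < score < 1; a 0% or 100% score has no finite Elo difference.
    """
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_summary(wins, draws, losses, z=1.96):
    """Return (score, elo, elo_low, elo_high) for one side of a match.

    z=1.96 gives an approximate 95% interval (normal approximation).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the result, where a game is worth 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)

    # Clamp so the interval ends stay inside (0, 1) before converting to Elo.
    eps = 1e-9
    low = max(score - z * stderr, eps)
    high = min(score + z * stderr, 1.0 - eps)
    return score, elo_diff(score), elo_diff(low), elo_diff(high)


if __name__ == "__main__":
    # Fritz 7 result from the quoted table: +122 =42 -36 over 200 games.
    score, elo, elo_low, elo_high = match_summary(122, 42, 36)
    print(f"score {score:.3f}, Elo diff {elo:.0f} "
          f"(approx. 95%: {elo_low:.0f} to {elo_high:.0f})")
```

  For +122 =42 -36 the point estimate comes out at about 160 Elo, the same gap
as in the quoted table, with an interval of very roughly 115 to 210. With the
same percentages over 500 games the error on the score shrinks by a factor of
sqrt(500/200), about 1.6. Of course this only measures the statistical noise; it
says nothing about the book, ponder, or hardware conditions discussed above.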

  Regards,

  José C.



Last modified: Thu, 15 Apr 21 08:11:13 -0700
