Computer Chess Club Archives


Subject: Re: Fritz7 vs Crafty 18.12

Author: Christophe Theron

Date: 09:11:04 12/18/01


On December 18, 2001 at 11:08:47, José Carlos wrote:

>On December 18, 2001 at 10:19:05, pavel wrote:
>
>>On December 18, 2001 at 09:52:49, José Carlos wrote:
>>
>>>On December 18, 2001 at 08:20:52, pavel wrote:
>>>
>>>>(Arguably) The strongest commercial chess program vs (Arguably) the strongest
>>>>freeware chess program, in a very arguable matchup.
>>>>
>>>>;)
>>>>
>>>>--------------------------
>>>>Book = 2600.ctg
>>>>Hash = 50mb both
>>>>TB = none.
>>>>Time Control = 5min/side
>>>>Ponder = off
>>>>Hardware= Pentium III/ 512mb ram.
>>>>OS = Windows 2000 Pro.
>>>>---------------------------
>>>>
>>>>
>>>>
>>>>  Program             Elo    +   -   Games   Score   Av.Op.  Draws
>>>>
>>>>  1 Fritz 7         : 2580   36  58   200    71.5 %   2420   21.0 %
>>>>  2 Crafty 18.12    : 2420   58  36   200    28.5 %   2580   21.0 %
>>>>
>>>>
>>>>
>>>>Individual statistics:
>>>>
>>>>(1) Fritz 7                   : 200 (+122,= 42,- 36), 71.5 %
>>>>
>>>>Crafty 18.12                  : 200 (+122,= 42,- 36), 71.5 %
>>>>
>>>>
>>>>(2) Crafty 18.12              : 200 (+ 36,= 42,-122), 28.5 %
>>>>
>>>>Fritz 7                       : 200 (+ 36,= 42,-122), 28.5 %
>>>>
>>>>
>>>>
>>>>The difference between freeware chess programs and commercial programs seems to
>>>>be getting bigger. OK, OK, probably this result doesn't say much, but I am sure
>>>>this is the case. Or is it that arguable?
>>>>
>>>>
>>>>Have fun,
>>>>Pavs.
>>>
>>>  Let's be scientific. Your test shows:
>>>
>>>  Fritz 7 + 2600.ctg
>>>  seems stronger at 5 min/game on a PIII (unknown MHz) + ponder off than
>>>  Crafty 18.12 + 2600.ctg
>>>  with a certain degree of confidence given by the number (200) of games.
>>>
>>>  Neither program uses its default book. The time/move is unknown since we
>>>don't know the clock speed. Ponder is off, which is not a default setting.
>>>
>>>  I don't know your test is worthless, don't get me wrong. I only say it does
>
>  Sorry here, my horrible english... I meant: "I don't mean your test is
>                                                       ^^^^
>worthless, don't get me wrong."
>
>>>not prove anything but the above stated. Nothing about commercials or amateurs;
>>>fritz or crafty; fritz or crafty + default settings; and so on...
>>>
>>>  José C.
>>
>>
>>Oh yeah, of course, I forgot to mention: Pentium III 1 GHz.
>>
>>I am not going to claim that my test is the best, but it is probably not
>>worthless either.
>
>  As I said above, I don't think so either.
>
>>1) Ponder off is a default setting under the CB interface. Since neither program
>>is pondering, I don't see a problem.
>
>  The problem is that Bob has stated many times that 'his default' is ponder on.
>So ponder off is not the default for Crafty, and in some way it hurts its strength.
>
>>2) Both programs used the same opening book, built from a well-known set of PGN
>>files. If there is anything wrong with the opening book, both programs will
>>suffer, as the opening is reversed in every game. IMO the strength of the
>>program doesn't include the opening book; the opening book is a way to increase
>>the strength of a program.
>
>  This has been discussed many times, so maybe I shouldn't bring it up again, but
>I can't resist :)
>  The book is part of the program. Different books make the program play
>different positions. If you use a book with very positional lines in a Hiarcs -
>GT match it will probably benefit Hiarcs. If you use a wild book, it will
>probably be better for GT.
>  In both cases the book is the same for both programs, but the result is quite
>different.
>  The book, like the rest of the program, has a 'style'. For example, I have been
>working on a tournament book for my program for several months. I don't only
>choose 'correct' lines, but lines where my program plays correctly. I've found
>many pawn sacs in GM games that make my program instantly show -0.90. I don't
>want such lines in my book even if they're correct... but GT would probably love
>them...
>
>>It is a well-known fact in this forum that you can never be perfect in an
>>eng-eng match, no matter how many games you play or whatever precautions you
>>take.
>
>  Sure. And I have no problem about it, since it happens to all of us. I only
>have 'problems' (not really problems... simply I disagree) with incorrect claims
>about the meaning of the matches.
>
>>Even though the games were just fun, i was just trying to get some meaning out
>>of it.
>
>  Yep, that's the problem. Getting meaning out of games is difficult and
>'dangerous'.
>  I'll tell you a little story: when I first read a post of Christophe's claiming
>that a lost game is worthless to him, I thought he was just disappointed about
>losing. Later, I reread his words many times and understood his point. The
>point is: you modify something in your program in order to get better
>performance over a lot of games, not in order to correct something in
>particular. That way, you get better for sure. So, it's something like
>'quantum chess'. A single game (particle) means nothing. It's a big number of
>them that makes sense. And not only that: the meaning depends on the
>'circumstances' in which the games were played.
>
>
>>regards
>>;)
>>pavs
>
>  Regards,
>
>  José C.



Let's put it with some tact:

1) if a program needs a special book to play correctly and cannot stand to use
the same book as its opponent, then it SUCKS.

2) if a program is not able to perform well with ponder=off when its opponent is
also ponder=off, it SUCKS.

I do not think 1 and 2 apply to Crafty.

No matter what Bob says, I have yet to see any proof that Crafty is handicapped
by ponder=off. As far as I remember, results have shown that Crafty does not
perform worse in ponder=off than in ponder=on matches.

And this applies to most if not all chess engines.

I also do not see any reason to believe that Crafty is more handicapped than
Fritz by a book that has not been designed specially for it.

I could even say that a commercial program, which is supposed to be helped by a
hand tuned book, should be the most handicapped of the two.


I find Pavel's experiment interesting and I think it tells a lot about the
relative strength of Crafty and Fritz. I'm pretty sure additional experiments
will confirm this result, independently of the time control, book, and
ponder setting used.

Those who reject the result do so for very strange reasons. Actually I think
they would reject the result of any experiment. In this world you need to be
able to draw conclusions (including a margin of error in your conclusion) from
an imperfect experimental setup, using your own experience and understanding of
the field of the experiment.
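
As a rough illustration of attaching a margin of error to a result like Pavel's
(+122 =42 -36), here is a minimal Python sketch. It assumes the standard logistic
Elo model and a normal approximation to the match score. It is not the method
used by the rating tool that produced the table quoted above, so its interval
need not match that output, and the function names are made up for the example.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_summary(wins, draws, losses, z=1.96):
    """Return (score, elo, elo_low, elo_high) for a two-player match.

    Assumption: games are independent and the mean score is roughly
    normally distributed; z=1.96 gives an approximate 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Sample variance of a single game result (1, 0.5 or 0 points).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    low, high = score - z * se, score + z * se
    return score, elo_diff(score), elo_diff(low), elo_diff(high)


# Fritz 7's result from the quoted match: +122 =42 -36
score, elo, elo_low, elo_high = match_summary(122, 42, 36)
print(f"score {score:.1%}, Elo diff {elo:+.0f} "
      f"(approx. 95% interval {elo_low:+.0f} to {elo_high:+.0f})")
```

Under these assumptions the whole interval stays well above zero, which is the
sense in which 200 games can support a conclusion even from an imperfect setup.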



    Christophe



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.