Author: CLiebert
Date: 08:00:36 12/19/01
On December 18, 2001 at 12:11:04, Christophe Theron wrote:

>On December 18, 2001 at 11:08:47, José Carlos wrote:
>
>>On December 18, 2001 at 10:19:05, pavel wrote:
>>
>>>On December 18, 2001 at 09:52:49, José Carlos wrote:
>>>
>>>>On December 18, 2001 at 08:20:52, pavel wrote:
>>>>
>>>>>(Arguably) The strongest commercial chess program vs (Arguably) the strongest
>>>>>freeware chess program, in a very arguable matchup.
>>>>>
>>>>>;)
>>>>>
>>>>>--------------------------
>>>>>Book = 2600.ctg
>>>>>Hash = 50mb both
>>>>>TB = none.
>>>>>Time Control = 5min/side
>>>>>Ponder = off
>>>>>Hardware= Pentium III/ 512mb ram.
>>>>>OS = Windows 2000 Pro.
>>>>>---------------------------
>>>>>
>>>>>    Program          Elo    +    -   Games   Score   Av.Op.   Draws
>>>>>
>>>>>  1 Fritz 7        : 2580   36   58   200    71.5 %   2420    21.0 %
>>>>>  2 Crafty 18.12   : 2420   58   36   200    28.5 %   2580    21.0 %
>>>>>
>>>>>Individual statistics:
>>>>>
>>>>>(1) Fritz 7        : 200 (+122,= 42,- 36), 71.5 %
>>>>>
>>>>>Crafty 18.12       : 200 (+122,= 42,- 36), 71.5 %
>>>>>
>>>>>(2) Crafty 18.12   : 200 (+ 36,= 42,-122), 28.5 %
>>>>>
>>>>>Fritz 7            : 200 (+ 36,= 42,-122), 28.5 %
>>>>>
>>>>>The differance between freeware chessprograms and commecial programs seems to be
>>>>>just going bigger. Ok, Ok probably this result doesnt say much but, I am sure
>>>>>this is the case. Or is it that arguable?
>>>>>
>>>>>Have fun,
>>>>>Pavs.
>>>>
>>>>  Let's be scientific. Your test shows:
>>>>
>>>>  Fritz 7 + 2600.ctg
>>>>  seems stronger at 5 min/game in a PIII unknown mhz + ponder off than
>>>>  Crafty 18.12 + 2600.ctg
>>>>  with a certain degree of confidence given by the number (200) of games.
>>>>
>>>>  Neither program use their default book. The time/move is unknown since we
>>>>don't know the clock speed. Ponder is off which is not a default setting.
>>>>
>>>>  I don't know your test is worthless, don't get me wrong. I only say it does
>>
>>  Sorry here, my horrible english...
>>I meant: "I don't mean your test is
>>                  ^^^^
>>worthless, don't get me wrong."
>>
>>>>not prove anything but the above stated. Nothing about commercials or amateurs;
>>>>fritz or crafty; fritz or crafty + default settings; and so on...
>>>>
>>>>  José C.
>>>
>>>
>>>oh yeah ofcourse I forgot to put, Pentium III 1Ghz.
>>>
>>>Even though I am not going to try to say that my test is the best. but probably
>>>is not worthless either.
>>
>>  As I say above, I don't think it either.
>>
>>>1) POnder off is a default setting under CB interface. Since both programs are
>>>not pondering, I dont see a problem
>>
>>  Problem is that Bob has stated many times 'his default' is ponder on. So
>>ponder off is not default for crafty, so in some way, it hurts its strength.
>>
>>>2)Both program used same opening book from a well-known set of pgn file. If
>>>there is anything wrong with the opening book, both program will suffer. As the
>>>opening is reversed in every game. IMO the strength of the program doesnt
>>>include opening book, opening book is a way to increase the strenght of a
>>>program.
>>
>>  This has been discussed many times, so maybe I should bring it up again but I
>>can't resist :)
>>  The book is part of the program. Different books make the program play
>>different positions. If you use a book with very positional lines in a Hiarcs -
>>GT match it will probably benefit Hiarcs. If you use a wild book, it will
>>probably be better for GT.
>>  In both cases the book is the same for both programs, but the result is quite
>>different.
>>  The book, as the rest of the program, has a 'style'. For example, I'm working
>>on a tournament book for my program for several months. I don't only chose
>>'correct' lines, but lines where my program play correctly. I've found many pawn
>>sacs in GM's games that make my program instantly show -0.90. I don't want such
>>lines in my book even if they're correct... but GT would probably love them...
>>
>>>It is a well-known fact in this forum, that you can never be perfect in a
>>>eng-eng match. No matter how many games you play or whatever precautions you
>>>take.
>>
>>  Sure. And I have no problem about it, since it happens to all of us. I only
>>have 'problems' (not really problems... simply I disagree) with incorrect claims
>>about the meaning of the matches.
>>
>>>Even though the games were just fun, i was just trying to get some meaning out
>>>of it.
>>
>>  Yep, that's the problem. Getting meaning out of games is difficult and
>>'dangerous'.
>>  I'll tell you a little story: when I first read a post of Christophe claiming
>>that a lost games is worthless for him I thought his was just disappointed for
>>losing. Later, I rewrote his words many times and understood his point. The
>>point is: you modify something in your program in order to get a better
>>performance over a lot of games, not in order to correct something in
>>particular. That way, you get better for sure. So, it's something like a
>>'quantum chess'. A single game (particle) means nothing. It's a big number of
>>them that make sense. And not only that: the meaning depends on the
>>'circumstances' how the games were played.
>>
>>
>>>regards
>>>;)
>>>pavs
>>
>>  Regards,
>>
>>  José C.
>
>
>
>Let's put it with some tact:
>
>1) if a program needs a special book to play correctly and cannot stand to use
>the same book as its opponent, then it SUCKS.
>
>2) if a program is not able to perform well with ponder=off when its opponent is
>also ponder=off, it SUCKS.
>
>I do not think 1 and 2 apply to Crafty.
>
>No matter what Bob tells, I have yet to see any proof that Crafty is handicapped
>by ponder=off. As far as I remember, results have shown that Crafty does not
>perform worse in ponder=on than in ponder=off matches.
>
>And this apply to most if not all chess engines.
>
>I also do not see any reason to believe that Crafty is more handicapped than
>Fritz by a book that has not been designed specially for it.
>
>I could even say that a commercial program, which is supposed to be helped by a
>hand tuned book, should be the most handicapped of the two.
>
>
>I find Pavel's experiment interesting and I think it tells a lot about the
>respective strength of Crafty and Fritz. I'm pretty sure additional experiments
>will confirm this result, independantly of the time controls and book, and
>ponder setting used.
>
>Those who reject the result do it for very strange reasons. Actually I think
>they would reject the result of any experiment. In this world you need to be
>able to draw conclusions (including margin of error in your conclusion) from an
>unperfect experiment setup, using your own experience and understanding of the
>experiment field.
>
>
>
>    Christophe

In summary, I agree with you, Christophe.

BTW, in active chess with given openings, Crafty shows results of around 20-30%
against Fritz, Shredder or Tiger. 1-2 years ago Crafty was closer to the top
under these conditions.

Some results:

(34) Crafty 18.10-18.11  : 530 (+204,=157,-169), 53.3 %

Shredder 5.32            :  18 (+  2,=  4,- 12), 22.2 %
Hiarcs X 99              :  30 (+  6,= 10,- 14), 36.7 %
Crafty 17.14             :  20 (+  7,=  4,-  9), 45.0 %
Goliath Light 2B1.9c     :  15 (+  4,=  4,-  7), 40.0 %

and against the newer ones always between 25-33%:

Chess Tiger 14.0         :  50 (+  9,= 13,- 28), 31.0 % !
Junior 7.0               :  39 (+  8,=  8,- 23), 30.8 %
Shredder 5.32X           :  15 (+  3,=  4,-  8), 33.3 %
Fritz6c                  :  19 (+  3,=  5,- 11), 28.9 %
Fritz Cadaques D900      :  20 (+  3,=  4,- 13), 25.0 %
Fritz 7 (No MMX)         :  20 (+  3,=  8,-  9), 35.0 %
Fritz Cadaques           :  20 (+  3,=  5,- 12), 27.5 %
Fritz 7                  :  20 (+  2,=  6,- 12), 25.0 %
Deep Fritz               :  44 (+  8,= 13,- 23), 33.0 %

But I also assume that 18.12 isn't the strongest version...

Chr.
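
For anyone who wants to check the percentages and the Elo gap quoted in this thread, here is a minimal Python sketch. It assumes the usual logistic Elo relation between expected score and rating difference and a simple normal approximation for the margin of error. The function names are my own, and this is not necessarily how the rating tool behind the tables above computes its "+" and "-" columns.

```python
import math


def score_fraction(wins, draws, losses):
    """Score as a fraction of games played: a win counts 1, a draw 0.5."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_difference(score):
    """Rating difference implied by a score fraction (logistic Elo model).

    Assumption: expected score = 1 / (1 + 10 ** (-diff / 400)).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_margin(wins, draws, losses, z=1.96):
    """Approximate half-width of a confidence interval on the score fraction.

    Uses the per-game variance of the results and a normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    games = wins + draws + losses
    mean = score_fraction(wins, draws, losses)
    variance = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / games
    return z * math.sqrt(variance / games)


if __name__ == "__main__":
    # Fritz 7 vs Crafty 18.12 from Pavel's match: +122, =42, -36
    s = score_fraction(122, 42, 36)
    m = score_margin(122, 42, 36)
    print(f"score {100 * s:.1f} % +/- {100 * m:.1f} %")
    print(f"implied Elo difference {elo_difference(s):.0f}")
```

For Pavel's match (+122, =42, -36) this gives a score of 71.5 % and an implied difference of roughly 160 Elo, which agrees with the 2580 vs 2420 ratings in his table. With only 15-20 games per opponent, as in my list, the margin on each percentage is of course much wider.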
Last modified: Thu, 15 Apr 21 08:11:13 -0700