Computer Chess Club Archives


Subject: Re: Crafty 17-10 v Fritz 6a. Nunn 1.....1 min, 2 min, 3 min, 5 min

Author: Christophe Theron

Date: 11:39:50 04/24/00


On April 24, 2000 at 14:00:42, Andrew Dados wrote:

>On April 24, 2000 at 09:52:48, Robert Hyatt wrote:
>
>>On April 24, 2000 at 00:42:47, Christophe Theron wrote:
>>
>>>On April 23, 2000 at 22:52:56, blass uri wrote:
>>>
>>>>On April 23, 2000 at 16:16:46, Christophe Theron wrote:
>>>>
>>>>>On April 23, 2000 at 06:33:59, blass uri wrote:
>>>>>
>>>>>>On April 23, 2000 at 04:15:48, Christophe Theron wrote:
>>>>>>
>>>>>>>On April 23, 2000 at 00:43:49, Chessfun wrote:
>>>>>>>
>>>>>>>>On April 22, 2000 at 18:35:45, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On April 22, 2000 at 13:13:00, Chessfun wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Since I never got a reply on what those blitz times were on the
>>>>>>>>>>previous thread, I played 1 min game, 2 min game, 3 min game and
>>>>>>>>>>5 min game.
>>>>>>>>>>
>>>>>>>>>>I was surprised to read that Crafty 17-10 could beat Fritz 6a
>>>>>>>>>>in Nunn 1 blitz as no previous version I had was close.
>>>>>>>>>>
>>>>>>>>>>All games on Cel 433.
>>>>>>>>>>Anyone wanting the games email me.
>>>>>>>>>>
>>>>>>>>>>1 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>>>2 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>>>3 min game Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>>>>>>>>5 min game Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>>>>>>>>
>>>>>>>>>>Thanks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>That's a very interesting experiment. Please, keep on playing with longer time
>>>>>>>>>controls.
>>>>>>>>>
>>>>>>>>>I'm interested in knowing how programs behave when you change the time controls.
>>>>>>>>>
>>>>>>>>>Let's see if the results change drastically with much longer time controls.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>I have played 10 mins (it's posted); now I'm trying 25, then will finish
>>>>>>>>at 60 mins.
>>>>>>>>
>>>>>>>>The original intent was I did not believe the post:
>>>>>>>>Sensation Crafty 17-10 beats F6 at nunn 1.
>>>>>>>>
>>>>>>>>Still don't believe it, don't believe the results or the
>>>>>>>>games could ever be reproduced.
>>>>>>>>
>>>>>>>>The hype around the post was all "of course, it's natural";
>>>>>>>>then when F6 wins it's "well, it's only blitz".
>>>>>>>>
>>>>>>>>Crafty is a fine program, but sometimes there is IMO
>>>>>>>>a little bit too much biased hype surrounding its
>>>>>>>>results.
>>>>>>>>
>>>>>>>>Thanks.
>>>>>>>
>>>>>>>
>>>>>>>I bet that you'll get the same result (inside the mathematical error margin),
>>>>>>>whatever time control you use.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>    Christophe
>>>>>>
>>>>>>I believe that if you do enough games you will not get the same result.
>>>>>>
>>>>>>It seems that Crafty is better at long time control.
>>>>>>
>>>>>>Based on blitz results you were right that Crafty is 100 elo weaker than the top
>>>>>>programs, but it seems based on the ssdf games that Crafty is not 100 elo weaker
>>>>>>than the top programs at tournament time control, in spite of the fact that Crafty
>>>>>>does not like the ssdf hardware and could gain 20 more elo if it used a
>>>>>>Pentium instead of a K6III-450.
>>>>>>
>>>>>>Crafty is only 97 elo weaker than top Fritz6a and Fritz6a is better than the
>>>>>>average top program.
>>>>>>
>>>>>>Maybe the results of blitz games and tournament games do not prove with 95%
>>>>>>confidence that crafty is better at long time control but I guess that it is
>>>>>>only because of the fact that the ssdf do not have enough games.
>>>>>>
>>>>>>Uri
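To put rough numbers on the "mathematical error margin" and "95% confidence" mentioned above, here is a minimal sketch that converts each 20-game score quoted earlier in the thread into an implied Elo difference with an approximate interval. The function names are made up for illustration, and the error model is an assumption: each game is treated as an independent win/loss trial, which ignores draws and so overstates the spread somewhat.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction (standard logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

def elo_interval(points, games, z=1.96):
    """Approximate 95% interval on the Elo difference from one match.

    Assumption: binomial (win/loss only) standard error; draws are ignored,
    so the real interval is probably a little narrower.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    low = max(p - z * se, 1e-6)          # clamp to keep the logarithm defined
    high = min(p + z * se, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(p), elo_diff(high)

# Fritz 6a points out of 20 games, as posted at the top of the thread.
fritz_points = {"1 min": 14.5, "2 min": 14.5, "3 min": 13.0, "5 min": 15.5}

for time_control, points in fritz_points.items():
    low, mid, high = elo_interval(points, 20)
    print(f"{time_control}: {mid:+.0f} Elo (roughly {low:+.0f} to {high:+.0f})")
```

Under this crude model the four intervals overlap heavily, so 20 games per time control cannot by themselves separate "same result at every time control" from "Crafty gains with more time".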
>>>>>
>>>>>
>>>>>So you say that Crafty is better at long time controls, but practically we will
>>>>>never know because it is unlikely that enough games will be played to confirm
>>>>>your statement.
>>>>
>>>>I did not say it.
>>>>I said that maybe the results of blitz games do not prove with 95% confidence.
>>>>
>>>>1)I am not sure about it and maybe they do prove with 95% confidence that crafty
>>>>is better at long time control.
>>>>
>>>>2)If they prove only with 80% confidence I do not ignore it.
>>>>>
>>>>>In this case I think it is much better to assume that Crafty is not better than
>>>>>his opponents at long time controls.
>>>>>
>>>>>That's a simple matter of economy: when it's not necessary to introduce a new
>>>>>rule to a model, just don't introduce any new rule.
>>>>
>>>>I do not think that it is logical to assume that different programs earn the
>>>>same from time.
>>>>
>>>>I guess that for every 2 programs if you play enough games you can find that one
>>>>program earns more from time.
>>>>
>>>>I think that if you test Crafty17.10 against hiarcs7.32 or tiger you will find
>>>>a bigger difference than the difference between Crafty17.10 and Fritz6a, so it is
>>>>better to test Crafty17.10 against hiarcs7.32 or tiger at different time controls to
>>>>prove that the theory that different programs earn the same from time is wrong.
>>>>>Introducing fancy new rules everywhere is in my opinion obscurantism.
>>>>>
>>>>>When you have evidence that the model is incomplete to describe what happens in
>>>>>reality, add a rule.
>>>>
>>>>If we had no knowledge about chess programs you would be right, but
>>>>it is simply not logical to assume that programs with different evaluation
>>>>function and different search rules earn the same from time.
>>>>
>>>>I think that some knowledge that you cannot detect by searching 1 or 2 plies
>>>>deeper is more important at long time control.
>>>>
>>>>I think that crafty has some knowledge that you cannot detect by searching 1-2
>>>>plies deeper that tiger does not have (I remember the case about KBPP vs K and I
>>>>guess that it is not the only case) and if you want to improve tiger then
>>>>looking at the evaluation function of crafty may be a good idea.
>>>>
>>>>Tiger has probably better search rules than crafty but I guess that the
>>>>evaluation is more important at long time control and this is the reason that
>>>>the difference between tiger and Crafty is smaller at long time control.
>>>>
>>>>At short time controls you have better chances to outsearch the opponent, whereas at
>>>>long time controls, if you do not build a good position, you will have no way
>>>>to do it.
>>>>
>>>>This is my guess.
>>>>
>>>>Uri
>>>
>>>
>>>
>>>I can defend opposite ideas (without saying I 100% believe in them, it's just to
>>>explain why the situation is not so simple):
>>>
>>>
>>>Searching deeper lets you understand some positional concepts, even if you do
>>>not have them in your evaluation function.
>>>
>>>It is also possible to think that a program that has more knowledge built in its
>>>evaluation function will understand things at blitz, things that its opponent
>>>will never understand (because it would understand only if it could search
>>>deeper).
>>>
>>>So I could argue that more accurate knowledge will be much more important in
>>>blitz games.
>>>
>>>But these arguments lead to nothing.
>>>
>>>
>>>I think we can say that a given amount of knowledge (without taking search into
>>>account) gives you X elo points. Then the search (actually the depth of the
>>>search) gives you Y additional elo points.
>>>
>>>For a given program, X is fixed and does not depend on the time control. Y
>>>depends on the speed of the computer and the search time.
>>>
>>>So what's left to discuss is "how much do you gain from the search when you
>>>increase the search time".
>>>
>>>Actually this is very closely related to what we call the "effective branching
>>>factor".
>>>
>>>A program with a lower EBF will get more points at longer time control than a
>>>program with a high EBF.
>>>
>>>If Crafty has a better EBF than Fritz, then you can expect Crafty to get better
>>>with longer time controls than Fritz.
>>>
>>>I think it would be possible to somehow measure the EBF of programs in the range
>>>blitz to tournament time controls. Together with the results of blitz matches,
>>>it would certainly be possible to get results very close to what the SSDF gets.
>>>And in less time.
>>>
>>>Incidentally, I think that most of the good chess programs have almost the same
>>>EBF. I would call it the "state of the art EBF". That's what explains, IMO, why
>>>blitz match results are not completely irrelevant.
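The EBF argument above can be sketched numerically. This is only an illustration under a simplifying assumption: the time needed to finish depth d grows as EBF**d, so multiplying the search time by some ratio adds log(ratio) / log(EBF) plies. The EBF values and the 30x time ratio below are hypothetical, not measurements of Crafty, Fritz or any other program.

```python
import math

def extra_plies(time_ratio, ebf):
    """Extra plies gained when search time is multiplied by time_ratio.

    Assumption: search time grows as ebf**depth, with ebf constant over
    the whole depth range (real programs only approximate this).
    """
    return math.log(time_ratio) / math.log(ebf)

TIME_RATIO = 30.0  # hypothetical blitz-to-tournament increase in time per move

for ebf in (2.5, 3.0, 4.0):  # hypothetical effective branching factors
    gain = extra_plies(TIME_RATIO, ebf)
    print(f"EBF {ebf:.1f}: about {gain:.1f} extra plies for {TIME_RATIO:.0f}x the time")
```

In this model a program with a lower EBF gains more depth from the same increase in time, and two programs with the same EBF gain the same depth, which is the sense in which a shared "state of the art EBF" would keep blitz results relevant.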
>>
>>
>>I don't consider EBF the issue.  One good example was a test I ran many years
>>ago between Cray Blitz and Genius.  Due to machine time limits, I started off
>>playing some _very_ fast games.  It became quickly obvious that Genius was
>>not going to be able to even draw a game, much less win one.  I continued to
>>lengthen the time control until at least the games started to become something
>>like 'chess'.  The problem (at the time) for genius was that Cray Blitz had a
>>very tactically tuned search, using singular extensions and some other things
>>we were doing.  And every game was tactically over by move 30 where we were
>>ahead by enough material to make the win easy.  As the depth increased, our
>>tactical superiority began to erode as both programs were going deeper, but
>>genius was 'catching up' with enough depth to begin to avoid the simple
>>tactical blunders it was making at the 1-5 seconds/move level.
>>
>>I think a program with a 'so-so' search (Crafty, as I have not yet tried to
>>finish the singular extension code and so forth), but a reasonable evaluation,
>>will have more trouble at fast time controls where the difference between an
>>effective 10 ply search and an effective 8 ply search is much more important
>>than the difference between an effective 16 ply search and an effective 14 ply
>>search.
>
>
> Does it sound like 'diminishing returns' from extra plies? I thought there was
>no evidence of that.



Exactly!





> However pruning tricks like null move and others may
>require some 'minimal search depth' to avoid most tactical blunders. And it may
>be the case here....
>
>-Andrew-



Last modified: Thu, 15 Apr 21 08:11:13 -0700
