Computer Chess Club Archives



Subject: Re: Crafty 17-10 v Fritz 6a. Nunn 1.....1 min, 2 min, 3 min, 5 min

Author: Robert Hyatt

Date: 06:52:48 04/24/00



On April 24, 2000 at 00:42:47, Christophe Theron wrote:

>On April 23, 2000 at 22:52:56, blass uri wrote:
>
>>On April 23, 2000 at 16:16:46, Christophe Theron wrote:
>>
>>>On April 23, 2000 at 06:33:59, blass uri wrote:
>>>
>>>>On April 23, 2000 at 04:15:48, Christophe Theron wrote:
>>>>
>>>>>On April 23, 2000 at 00:43:49, Chessfun wrote:
>>>>>
>>>>>>On April 22, 2000 at 18:35:45, Christophe Theron wrote:
>>>>>>
>>>>>>>On April 22, 2000 at 13:13:00, Chessfun wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>Since I never got a reply on what those blitz times were on the
>>>>>>>>previous thread, I played 1 min game, 2 min game, 3 min game and
>>>>>>>>5 min game.
>>>>>>>>
>>>>>>>>I was surprised to read that Crafty 17-10 could beat Fritz 6a
>>>>>>>>in Nunn 1 blitz as no previous version I had was close.
>>>>>>>>
>>>>>>>>All games on Cel 433.
>>>>>>>>Anyone wanting the games email me.
>>>>>>>>
>>>>>>>>1 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>2 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>3 min game Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>>>>>>5 min game Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>>>>>>
>>>>>>>>Thanks.
>>>>>>>
>>>>>>>
>>>>>>>That's a very interesting experiment. Please, keep on playing with longer time
>>>>>>>controls.
>>>>>>>
>>>>>>>I'm interested in knowing how programs behave when you change the time controls.
>>>>>>>
>>>>>>>Let's see if the results change drastically with much longer time controls.
>>>>>>>
>>>>>>>
>>>>>>>    Christophe
>>>>>>
>>>>>>I have played 10 mins (it's posted); now I'm trying 25, then will finish
>>>>>>at 60 mins.
>>>>>>
>>>>>>The original intent was I did not believe the post:
>>>>>>Sensation Crafty 17-10 beats F6 at nunn 1.
>>>>>>
>>>>>>Still don't believe it, don't believe the results or the
>>>>>>games could ever be reproduced.
>>>>>>
>>>>>>The hype around the post was all "of course, it's natural";
>>>>>>then when F6 wins it's "well, it's only blitz".
>>>>>>
>>>>>>Crafty is a fine program, but sometimes there is IMO
>>>>>>a little bit too much biased hype surrounding its
>>>>>>results.
>>>>>>
>>>>>>Thanks.
>>>>>
>>>>>
>>>>>I bet that you'll get the same result (inside the mathematical error margin),
>>>>>whatever time control you use.
>>>>>
>>>>>
>>>>>
>>>>>    Christophe
>>>>
>>>>I believe that if you do enough games you will not get the same result.
>>>>
>>>>It seems that Crafty is better at long time control.
>>>>
>>>>Based on blitz results you were right that Crafty is 100 Elo weaker than the
>>>>top programs, but based on the SSDF games it seems that Crafty is not 100 Elo
>>>>weaker than the top programs at tournament time control, in spite of the fact
>>>>that Crafty does not like the SSDF hardware and could gain 20 more Elo if it
>>>>used a Pentium instead of the K6III-450.
>>>>
>>>>Crafty is only 97 elo weaker than top Fritz6a and Fritz6a is better than the
>>>>average top program.
>>>>
>>>>Maybe the results of blitz games and tournament games do not prove with 95%
>>>>confidence that Crafty is better at long time control, but I guess that is
>>>>only because the SSDF does not have enough games.
>>>>
>>>>Uri
>>>
>>>
>>>So you say that Crafty is better at long time controls, but practically we will
>>>never know because it is unlikely that enough games will be played to confirm
>>>your statement.
>>
>>I did not say it.
>>I said that maybe the results of blitz games do not prove with 95% confidence.
>>
>>1)I am not sure about it and maybe they do prove with 95% confidence that crafty
>>is better at long time control.
>>
>>2)If they prove only with 80% confidence I do not ignore it.
>>>
>>>In this case I think it is much better to assume that Crafty is not better than
>>>his opponents at long time controls.
>>>
>>>That's a simple matter of economy: when it's not necessary to introduce a new
>>>rule to a model, just don't introduce any new rule.
>>
>>I do not think that it is logical to assume that different programs earn the
>>same from time.
>>
>>I guess that for every 2 programs if you play enough games you can find that one
>>program earns more from time.
>>
>>I think that if you test Crafty17.10 against hiarcs7.32 or tiger you will find
>>a bigger difference than the difference between Crafty17.10 and Fritz6a, so it
>>is better to test Crafty17.10 against hiarcs7.32 or tiger at different time
>>controls to prove that the theory that different programs earn the same from
>>time is wrong.
>>
>>>Introducing fancy new rules everywhere is in my opinion obscurantism.
>>>
>>>When you have evidence that the model is incomplete to describe what happens in
>>>reality, add a rule.
>>
>>If we had no knowledge about chess programs you would be right, but
>>it is simply not logical to assume that programs with different evaluation
>>functions and different search rules earn the same from time.
>>
>>I think that some knowledge that you cannot detect by searching 1 or 2 plies
>>deeper is more important at long time control.
>>
>>I think that Crafty has some knowledge that you cannot detect by searching 1-2
>>plies deeper that tiger does not have (I remember the case about KBPP vs K and I
>>guess that it is not the only case) and if you want to improve tiger then
>>looking at the evaluation function of Crafty may be a good idea.
>>
>>Tiger has probably better search rules than crafty but I guess that the
>>evaluation is more important at long time control and this is the reason that
>>the difference between tiger and Crafty is smaller at long time control.
>>
>>At short time controls you have better chances to outsearch the opponent,
>>whereas at long time controls, if you do not build a good position, you will
>>have no way to do it.
>>
>>This is my guess.
>>
>>Uri
>
>
>
>I can defend opposite ideas (without saying I 100% believe in them, it's just to
>explain why the situation is not so simple):
>
>
>Searching deeper lets you understand some positional concepts, even if you do
>not have them in your evaluation function.
>
>It is also possible to think that a program that has more knowledge built in its
>evaluation function will understand things at blitz, things that its opponent
>will never understand (because it would understand only if it could search
>deeper).
>
>So I could argue that more accurate knowledge will be much more important in
>blitz games.
>
>But these arguments lead to nothing.
>
>
>I think we can say that a given amount of knowledge (without taking search into
>account) gives you X elo points. Then the search (actually the depth of the
>search) gives you Y additional elo points.
>
>For a given program, X is fixed and does not depend on the time control. Y
>depends on the speed of the computer and the search time.
>
>So what's left to discuss is "how much do you gain from the search when you
>increase the search time".
>
>Actually this is very closely related to what we call the "effective branching
>factor".
>
>A program with a lower EBF will get more points at longer time control than a
>program with a high EBF.
>
>If Crafty has a better EBF than Fritz, then you can expect Crafty to get better
>with longer time controls than Fritz.
>
>I think it would be possible to somehow measure the EBF of programs in the range
>blitz to tournament time controls. Together with the results of blitz matches,
>it would certainly be possible to get results very close to what the SSDF gets.
>And in less time.
>
>Incidentally, I think that most of the good chess programs have almost the same
>EBF. I would call it the "state of the art EBF". That's what explains, IMO, why
>blitz match results are not completely irrelevant.
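
The EBF argument quoted above can be made concrete with a small sketch. It
assumes the usual reading of EBF (each extra ply multiplies search time by a
roughly constant factor), so multiplying the time by r buys about
log(r) / log(EBF) extra plies. The function name and the EBF values are
hypothetical, chosen only for illustration; they are not measurements of
Crafty, Fritz, or any other program.

```python
import math


def extra_plies(time_ratio: float, ebf: float) -> float:
    """Plies gained when search time is multiplied by time_ratio.

    Assumes a constant effective branching factor: reaching depth d+1
    costs about `ebf` times as long as reaching depth d.
    """
    if time_ratio <= 0 or ebf <= 1:
        raise ValueError("time_ratio must be > 0 and ebf must be > 1")
    return math.log(time_ratio) / math.log(ebf)


# Hypothetical EBF values, for illustration only.
# A factor of 60 is roughly the step from 1 minute to 1 hour per game.
for ebf in (2.5, 3.0, 4.0):
    gained = extra_plies(60, ebf)
    print(f"EBF {ebf}: 60x more time gives about {gained:.2f} extra plies")
```

Under this assumption the program with the lower EBF gains more depth from
the same increase in time, which is the effect the quoted model relies on.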


I don't consider EBF the issue.  One good example was a test I ran many years
ago between Cray Blitz and Genius.  Due to machine time limits, I started off
playing some _very_ fast games.  It quickly became obvious that Genius was
not going to be able to even draw a game, much less win one.  I continued to
lengthen the time control until at least the games started to become something
like 'chess'.  The problem (at the time) for Genius was that Cray Blitz had a
very tactically tuned search, using singular extensions and some other things
we were doing.  And every game was tactically over by move 30 where we were
ahead by enough material to make the win easy.  As the depth increased, our
tactical superiority began to erode as both programs were going deeper, but
Genius was 'catching up' with enough depth to begin to avoid the simple
tactical blunders it was making at the 1-5 seconds/move level.

I think a program with a 'so-so' search (Crafty, as I have not yet tried to
finish the singular extension code and so forth), but a reasonable evaluation,
will have more trouble at fast time controls where the difference between an
effective 10 ply search and an effective 8 ply search is much more important
than the difference between an effective 16 ply search and an effective 14 ply
search.




>

>However I would not recommend blitz matches as the universal measure of playing
>strength. Some competitor may find a trick to improve his EBF, and in this case
>it would be harder to notice in blitz than in tournament games.
>
>Also there is no guarantee that the EBF alone describes the variation of playing
>strength with time. Some new kind of very effective extension scheme could
>change that. At this time I think it is not the case, and I think EBF alone is
>the most relevant information. But it can change in the future.

I can think of more than one program that seems to have extensions tuned for
blitz.  At longer time controls, they 'bog down' badly.  This can be a serious
problem...





>
>This is a short explanation of my current model for chess programs. This model
>might not be useful for people who like to test chess programs, but it helps me
>to improve my own program, so at least it is useful for me.
>
>I also expect my model to change over time (I learn every day). But I will need
>evidence in order to change it.
>
>
>
>    Christophe
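
The "error margin" and "95% confidence" points raised earlier in the thread
can be checked roughly against the four 20-game match scores quoted at the
top. The sketch below assumes the standard logistic Elo formula and a normal
approximation for the score fraction, treating every game as a win or a loss
(the draw counts are not given in the thread, and ignoring draws makes the
interval wider, not narrower). The function names are hypothetical, and the
approximation is crude for only 20 games.

```python
import math


def elo_from_score(p: float) -> float:
    """Elo difference implied by score fraction p (logistic Elo model)."""
    eps = 1e-9
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithm finite
    return 400.0 * math.log10(p / (1.0 - p))


def match_elo_interval(points: float, games: int, z: float = 1.96):
    """Return (low, estimate, high) Elo difference for a match result.

    Uses a normal approximation with per-game variance p * (1 - p),
    which is an upper bound when some games are drawn.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    return (elo_from_score(p - z * se),
            elo_from_score(p),
            elo_from_score(p + z * se))


# Fritz 6a points out of 20 games, as reported in the thread.
results = {"1 min": 14.5, "2 min": 14.5, "3 min": 13.0, "5 min": 15.5}
for label, pts in results.items():
    low, est, high = match_elo_interval(pts, 20)
    print(f"{label}: {est:+.0f} Elo (about {low:+.0f} to {high:+.0f})")
```

With so few games per match the intervals come out wide and overlap across
the four time controls, so under these assumptions the scores alone do not
separate "same result at every time control" from "result drifts with time
control".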




Last modified: Thu, 15 Apr 21 08:11:13 -0700
