Computer Chess Club Archives


Subject: Re: Crafty 17-10 v Fritz 6a. Nunn 1.....1 min, 2 min, 3 min, 5 min

Author: Christophe Theron

Date: 21:42:47 04/23/00



On April 23, 2000 at 22:52:56, blass uri wrote:

>On April 23, 2000 at 16:16:46, Christophe Theron wrote:
>
>>On April 23, 2000 at 06:33:59, blass uri wrote:
>>
>>>On April 23, 2000 at 04:15:48, Christophe Theron wrote:
>>>
>>>>On April 23, 2000 at 00:43:49, Chessfun wrote:
>>>>
>>>>>On April 22, 2000 at 18:35:45, Christophe Theron wrote:
>>>>>
>>>>>>On April 22, 2000 at 13:13:00, Chessfun wrote:
>>>>>>
>>>>>>>
>>>>>>>Since I never got a reply on what those blitz times were on the
>>>>>>>previous thread, I played 1 min game, 2 min game, 3 min game and
>>>>>>>5 min game.
>>>>>>>
>>>>>>>I was surprised to read that Crafty 17-10 could beat Fritz 6a
>>>>>>>in Nunn 1 blitz as no previous version I had was close.
>>>>>>>
>>>>>>>All games on Cel 433.
>>>>>>>Anyone wanting the games email me.
>>>>>>>
>>>>>>>1 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>2 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>3 min game Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>>>>>5 min game Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>>>>>
>>>>>>>Thanks.
>>>>>>
>>>>>>
>>>>>>That's a very interesting experiment. Please, keep on playing with longer time
>>>>>>controls.
>>>>>>
>>>>>>I'm interested in knowing how programs behave when you change the time controls.
>>>>>>
>>>>>>Let's see if the results change drastically with much longer time controls.
>>>>>>
>>>>>>
>>>>>>    Christophe
>>>>>
>>>>>I have played 10 mins (it's posted); now I'm trying 25, then will finish
>>>>>at 60 mins.
>>>>>
>>>>>The original intent was I did not believe the post:
>>>>>Sensation Crafty 17-10 beats F6 at nunn 1.
>>>>>
>>>>>Still don't believe it, don't believe the results or the
>>>>>games could ever be reproduced.
>>>>>
>>>>>The hype around the post was all of course it's natural,
>>>>>then when F6 wins it's, well, it's only blitz.
>>>>>
>>>>>Crafty is a fine program, but sometimes there is IMO
>>>>>a little bit too much of biased hype surrounding its
>>>>>results.
>>>>>
>>>>>Thanks.
>>>>
>>>>
>>>>I bet that you'll get the same result (inside the mathematical error margin),
>>>>whatever time control you use.
>>>>
>>>>
>>>>
>>>>    Christophe
>>>
>>>I believe that if you do enough games you will not get the same result.
>>>
>>>It seems that Crafty is better at long time control.
>>>
>>>Based on blitz results you were right that Crafty is 100 elo weaker than the top
>>>programs but it seems based on the ssdf games that Crafty is not 100 elo weaker
>>>than the top programs at tournament time control, in spite of the fact that Crafty
>>>does not like the ssdf hardware and could earn 20 more elo points if it used
>>>a Pentium instead of a K6III-450.
>>>
>>>Crafty is only 97 elo weaker than top Fritz6a and Fritz6a is better than the
>>>average top program.
>>>
>>>Maybe the results of blitz games and tournament games do not prove with 95%
>>>confidence that crafty is better at long time control but I guess that it is
>>>only because of the fact that the ssdf do not have enough games.
>>>
>>>Uri
>>
>>
>>So you say that Crafty is better at long time controls, but practically we will
>>never know because it is unlikely that enough games will be played to confirm
>>your statement.
>
>I did not say it.
>I said that maybe the results of blitz games do not prove with 95% confidence.
>
>1)I am not sure about it and maybe they do prove with 95% confidence that crafty
>is better at long time control.
>
>2)If they prove only with 80% confidence I do not ignore it.
>>
>>In this case I think it is much better to assume that Crafty is not better than
>>his opponents at long time controls.
>>
>>That's a simple matter of economy: when it's not necessary to introduce a new
>>rule to a model, just don't introduce any new rule.
>
>I do not think that it is logical to assume that different programs earn the
>same from time.
>
>I guess that for every 2 programs if you play enough games you can find that one
>program earns more from time.
>
>I think that if you test Crafty17.10 against hiarcs7.32 or tiger you will find
>a bigger difference than the difference between Crafty17.10 and Fritz6a, so it is
>better to test Crafty17.10 against hiarcs7.32 or tiger at different time controls to
>prove that the theory that different programs earn the same from time is wrong.
>
>>Introducing fancy new rules everywhere is in my opinion obscurantism.
>>
>>When you have evidence that the model is incomplete to describe what happens in
>>reality, add a rule.
>
>If we had no knowledge about chess programs you would be right, but
>it is simply not logical to assume that programs with different evaluation
>function and different search rules earn the same from time.
>
>I think that some knowledge that you cannot detect by searching 1 or 2 plies
>deeper is more important at long time control.
>
>I think that crafty has some knowledge that you cannot detect by searching 1-2
>plies deeper that tiger does not have(I remember the case about KBPP vs K and I
>guess that it is not the only case) and if you want to improve tiger then
>looking at the evaluation function of crafty may be a good idea.
>
>Tiger has probably better search rules than crafty but I guess that the
>evaluation is more important at long time control and this is the reason that
>the difference between tiger and Crafty is smaller at long time control.
>
>In short time control you have better chances to outsearch the opponent, whereas in
>long time control, if you do not build a good position, you will have no way
>to do it.
>
>This is my guess.
>
>Uri



I can defend opposite ideas (without saying I 100% believe in them, it's just to
explain why the situation is not so simple):


Searching deeper lets you understand some positional concepts, even if you do
not have them in your evaluation function.

It is also possible to think that a program that has more knowledge built in its
evaluation function will understand things at blitz, things that its opponent
will never understand (because it would understand only if it could search
deeper).

So I could argue that more accurate knowledge will be much more important in
blitz games.

But these arguments lead to nothing.


I think we can say that a given amount of knowledge (without taking search into
account) gives you X elo points. Then the search (actually the depth of the
search) gives you Y additional elo points.

For a given program, X is fixed and does not depend on the time control. Y
depends on the speed of the computer and the search time.

So what's left to discuss is "how much do you gain from the search when you
increase the search time".

Actually this is very closely related to what we call the "effective branching
factor".

A program with a lower EBF will get more points at longer time control than a
program with a high EBF.

If Crafty has a better EBF than Fritz, then you can expect Crafty to get better
with longer time controls than Fritz.
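
A rough sketch of this X + Y model, for illustration only. It assumes that the time needed to finish depth d grows like EBF**d, and that each extra ply is worth a fixed number of elo points. The 50 elo per ply figure and the EBF values are hypothetical placeholders, not measurements of Crafty or Fritz.

```python
import math

def extra_plies(time_factor, ebf):
    """Plies gained when search time is multiplied by time_factor,
    assuming time to finish depth d grows like ebf**d."""
    if time_factor <= 0 or ebf <= 1:
        raise ValueError("time_factor must be > 0 and ebf must be > 1")
    return math.log(time_factor) / math.log(ebf)

def search_elo_gain(time_factor, ebf, elo_per_ply):
    """Y gained from the longer search; elo_per_ply is an assumed constant."""
    return extra_plies(time_factor, ebf) * elo_per_ply

if __name__ == "__main__":
    # Hypothetical comparison: 5 min games versus 120 min games (factor 24).
    for name, ebf in (("lower EBF", 2.5), ("higher EBF", 3.0)):
        gain = search_elo_gain(24, ebf, elo_per_ply=50)
        print(f"{name} ({ebf}): about {gain:.0f} elo from the extra time")
```

Under these assumptions the program with the lower EBF reaches more extra plies from the same increase in time, which is the whole argument in one line of arithmetic.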

I think it would be possible to somehow measure the EBF of programs in the range
blitz to tournament time controls. Together with the results of blitz matches,
it would certainly be possible to get results very close to what the SSDF gets.
And in less time.
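
One possible way to measure it, as a sketch: record the time a program needs to finish each search depth on some test positions, then fit a straight line to log(time) against depth. The exponential of the slope is an EBF estimate. The function name and the input format are assumptions made for this example, and the sample timings are made up.

```python
import math

def estimate_ebf(seconds_by_depth):
    """Estimate EBF from a dict {depth: seconds to finish that depth}
    by least-squares fitting log(seconds) = a + depth * log(EBF)."""
    depths = sorted(seconds_by_depth)
    if len(depths) < 2:
        raise ValueError("need timings for at least two depths")
    logs = [math.log(seconds_by_depth[d]) for d in depths]
    mean_d = sum(depths) / len(depths)
    mean_l = sum(logs) / len(logs)
    cov = sum((d - mean_d) * (l - mean_l) for d, l in zip(depths, logs))
    var = sum((d - mean_d) ** 2 for d in depths)
    return math.exp(cov / var)

if __name__ == "__main__":
    # Made-up timings in seconds, not taken from any real program.
    sample = {8: 1.1, 9: 3.2, 10: 8.7, 11: 27.5}
    print(f"estimated EBF: {estimate_ebf(sample):.2f}")
```

Averaging the estimate over many positions, and over the depth range that covers blitz to tournament time, would be needed before comparing two programs.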

Incidentally, I think that most of the good chess programs have almost the same
EBF. I would call it the "state of the art EBF". That's what explains, IMO, why
blitz match results are not completely irrelevant.

However I would not recommend blitz matches as the universal measure of playing
strength. Some competitor may find a trick to improve his EBF, and in this case
it would be harder to notice in blitz than in tournament games.

Also there is no guarantee that the EBF alone describes the variation of playing
strength with time. Some new kind of very effective extension scheme could
change that. At this time I think it is not the case, and I think EBF alone is
the most relevant information. But it can change in the future.

This is a short explanation of my current model for chess programs. This model
might not be useful for people who like to test chess programs, but it helps me
to improve my own program, so at least it is useful for me.

I also expect my model to change over time (I learn every day). But I will need
evidence in order to change it.
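
As an illustration of what the error margin looks like for 20-game matches like the ones quoted above, here is a rough sketch. It uses the standard logistic elo formula and a normal approximation to the score, and it treats every game as win or loss, which overstates the margin when there are draws. The function names are made up for this example.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(points, games, z=1.96):
    """Approximate (low, mid, high) elo difference for a match result.
    Conservative: the binomial variance ignores draws."""
    s = points / games
    se = math.sqrt(s * (1.0 - s) / games)
    low = max(s - z * se, 1e-6)          # clamp to keep the log defined
    high = min(s + z * se, 1.0 - 1e-6)
    return elo_diff(low), elo_diff(s), elo_diff(high)

if __name__ == "__main__":
    # Fritz 6a points out of 20 at each time control, from the quoted post.
    for label, points in (("1 min", 14.5), ("2 min", 14.5),
                          ("3 min", 13.0), ("5 min", 15.5)):
        low, mid, high = elo_interval(points, 20)
        print(f"{label}: {mid:+.0f} elo (roughly {low:+.0f} to {high:+.0f})")
```

With only 20 games per match the intervals are very wide, so under this approximation the four results cannot be told apart.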



    Christophe




Last modified: Thu, 15 Apr 21 08:11:13 -0700
