Author: Christophe Theron
Date: 11:39:50 04/24/00
On April 24, 2000 at 14:00:42, Andrew Dados wrote:

>On April 24, 2000 at 09:52:48, Robert Hyatt wrote:
>
>>On April 24, 2000 at 00:42:47, Christophe Theron wrote:
>>
>>>On April 23, 2000 at 22:52:56, blass uri wrote:
>>>
>>>>On April 23, 2000 at 16:16:46, Christophe Theron wrote:
>>>>
>>>>>On April 23, 2000 at 06:33:59, blass uri wrote:
>>>>>
>>>>>>On April 23, 2000 at 04:15:48, Christophe Theron wrote:
>>>>>>
>>>>>>>On April 23, 2000 at 00:43:49, Chessfun wrote:
>>>>>>>
>>>>>>>>On April 22, 2000 at 18:35:45, Christophe Theron wrote:
>>>>>>>>
>>>>>>>>>On April 22, 2000 at 13:13:00, Chessfun wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Since I never got a reply on what those blitz times were on the
>>>>>>>>>>previous thread, I played 1 min game, 2 min game, 3 min game and
>>>>>>>>>>5 min game.
>>>>>>>>>>
>>>>>>>>>>I was surprised to read that Crafty 17-10 could beat Fritz 6a
>>>>>>>>>>in Nunn 1 blitz as no previous version I had was close.
>>>>>>>>>>
>>>>>>>>>>All games on Cel 433.
>>>>>>>>>>Anyone wanting the games email me.
>>>>>>>>>>
>>>>>>>>>>1 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>>>2 min game Fritz 6a 14.5 - 5.5 Crafty 17-10
>>>>>>>>>>3 min game Fritz 6a 13.0 - 7.0 Crafty 17-10
>>>>>>>>>>5 min game Fritz 6a 15.5 - 4.5 Crafty 17-10
>>>>>>>>>>
>>>>>>>>>>Thanks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>That's a very interesting experiment. Please, keep on playing with longer time
>>>>>>>>>controls.
>>>>>>>>>
>>>>>>>>>I'm interested in knowing how programs behave when you change the time controls.
>>>>>>>>>
>>>>>>>>>Let's see if the results change drastically with much longer time controls.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Christophe
>>>>>>>>
>>>>>>>>I have played 10 mins (it's posted now). I'm trying 25, then will finish
>>>>>>>>at 60 mins.
>>>>>>>>
>>>>>>>>The original intent was I did not believe the post:
>>>>>>>>Sensation Crafty 17-10 beats F6 at nunn 1.
>>>>>>>>
>>>>>>>>Still don't believe it, don't believe the results or the
>>>>>>>>games could ever be reproduced.
>>>>>>>>
>>>>>>>>The hype around the post was all of course it's natural,
>>>>>>>>then when F6 wins it's, well, it's only blitz.
>>>>>>>>
>>>>>>>>Crafty is a fine program, but sometimes there is IMO
>>>>>>>>a little bit too much of biased hype surrounding its
>>>>>>>>results.
>>>>>>>>
>>>>>>>>Thanks.
>>>>>>>
>>>>>>>
>>>>>>>I bet that you'll get the same result (inside the mathematical error margin),
>>>>>>>whatever time control you use.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>    Christophe
>>>>>>
>>>>>>I believe that if you do enough games you will not get the same result.
>>>>>>
>>>>>>It seems that Crafty is better at long time control.
>>>>>>
>>>>>>Based on blitz results you were right that Crafty is 100 elo weaker than the top
>>>>>>programs but it seems based on the ssdf games that Crafty is not 100 elo weaker
>>>>>>than the top programs at tournament time control in spite of the fact that crafty
>>>>>>does not like the ssdf hardware and could earn 20 more elo if it used
>>>>>>pentium instead of K6III-450.
>>>>>>
>>>>>>Crafty is only 97 elo weaker than top Fritz6a and Fritz6a is better than the
>>>>>>average top program.
>>>>>>
>>>>>>Maybe the results of blitz games and tournament games do not prove with 95%
>>>>>>confidence that crafty is better at long time control but I guess that it is
>>>>>>only because of the fact that the ssdf do not have enough games.
>>>>>>
>>>>>>Uri
>>>>>
>>>>>
>>>>>So you say that Crafty is better at long time controls, but practically we will
>>>>>never know because it is unlikely that enough games will be played to confirm
>>>>>your statement.
>>>>
>>>>I did not say it.
>>>>I said that maybe the results of blitz games do not prove with 95% confidence.
>>>>
>>>>1)I am not sure about it and maybe they do prove with 95% confidence that crafty
>>>>is better at long time control.
>>>>
>>>>2)If they prove only with 80% confidence I do not ignore it.
>>>>>
>>>>>In this case I think it is much better to assume that Crafty is not better than
>>>>>his opponents at long time controls.
>>>>>
>>>>>That's a simple matter of economy: when it's not necessary to introduce a new
>>>>>rule to a model, just don't introduce any new rule.
>>>>
>>>>I do not think that it is logical to assume that different programs earn the
>>>>same from time.
>>>>
>>>>I guess that for every 2 programs if you play enough games you can find that one
>>>>program earns more from time.
>>>>
>>>>I think that if you test Crafty17.10 against hiarcs7.32 or tiger you will find
>>>>bigger difference than the difference between Crafty17.10 and Fritz6a so it is
>>>>better to test Crafty17.10 against hiarcs7.32 or tiger at different time controls to
>>>>prove that the theory that different programs earn the same from time is wrong.
>>>>>Introducing fancy new rules everywhere is in my opinion obscurantism.
>>>>>
>>>>>When you have evidence that the model is incomplete to describe what happens in
>>>>>reality, add a rule.
>>>>
>>>>If we had no knowledge about chess programs you would be right, but
>>>>it is simply not logical to assume that programs with different evaluation
>>>>function and different search rules earn the same from time.
>>>>
>>>>I think that some knowledge that you cannot detect by searching 1 or 2 plies
>>>>deeper is more important at long time control.
>>>>
>>>>I think that crafty has some knowledge that you cannot detect by searching 1-2
>>>>plies deeper that tiger does not have (I remember the case about KBPP vs K and I
>>>>guess that it is not the only case) and if you want to improve tiger then
>>>>looking at the evaluation function of crafty may be a good idea.
>>>>
>>>>Tiger has probably better search rules than crafty but I guess that the
>>>>evaluation is more important at long time control and this is the reason that
>>>>the difference between tiger and Crafty is smaller at long time control.
>>>>
>>>>In short time control you have better chances to outsearch the opponent, while in
>>>>long time control if you do not build a good position then you will have no way
>>>>to do it.
>>>>
>>>>This is my guess.
>>>>
>>>>Uri
>>>
>>>
>>>
>>>I can defend opposite ideas (without saying I 100% believe in them, it's just to
>>>explain why the situation is not so simple):
>>>
>>>
>>>Searching deeper lets you understand some positional concepts, even if you do
>>>not have them in your evaluation function.
>>>
>>>It is also possible to think that a program that has more knowledge built in its
>>>evaluation function will understand things at blitz, things that its opponent
>>>will never understand (because it would understand only if it could search
>>>deeper).
>>>
>>>So I could argue that more accurate knowledge will be much more important in
>>>blitz games.
>>>
>>>But these arguments lead to nothing.
>>>
>>>
>>>I think we can say that a given amount of knowledge (without taking search into
>>>account) gives you X elo points. Then the search (actually the depth of the
>>>search) gives you Y additional elo points.
>>>
>>>For a given program, X is fixed and does not depend on the time control. Y
>>>depends on the speed of the computer and the search time.
>>>
>>>So what's left to discuss is "how much do you gain from the search when you
>>>increase the search time".
>>>
>>>Actually this is very closely related to what we call the "effective branching
>>>factor".
>>>
>>>A program with a lower EBF will get more points at longer time control than a
>>>program with a high EBF.
>>>
>>>If Crafty has a better EBF than Fritz, then you can expect Crafty to get better
>>>with longer time controls than Fritz.
>>>
>>>I think it would be possible to somehow measure the EBF of programs in the range
>>>blitz to tournament time controls. Together with the results of blitz matches,
>>>it would certainly be possible to get results very close to what the SSDF gets.
>>>And in less time.
>>>
>>>Incidentally, I think that most of the good chess programs have almost the same
>>>EBF. I would call it the "state of the art EBF". That's what explains, IMO, why
>>>blitz match results are not completely irrelevant.
>>
>>
>>I don't consider EBF the issue. One good example was a test I ran many years
>>ago between Cray Blitz and Genius. Due to machine time limits, I started off
>>playing some _very_ fast games. It became quickly obvious that Genius was
>>not going to be able to even draw a game, much less win one. I continued to
>>lengthen the time control until at least the games started to become something
>>like 'chess'. The problem (at the time) for genius was that Cray Blitz had a
>>very tactically tuned search, using singular extensions and some other things
>>we were doing. And every game was tactically over by move 30 where we were
>>ahead by enough material to make the win easy. As the depth increased, our
>>tactical superiority began to erode as both programs were going deeper, but
>>genius was 'catching up' with enough depth to begin to avoid the simple
>>tactical blunders it was making at the 1-5 seconds/move level.
>>
>>I think a program with a 'so-so' search (Crafty, as I have not yet tried to
>>finish the singular extension code and so forth), but a reasonable evaluation,
>>will have more trouble at fast time controls where the difference between an
>>effective 10 ply search and an effective 8 ply search is much more important
>>than the difference between an effective 16 ply search and an effective 14 ply
>>search.
>
>
> Does it sound like 'diminishing returns' from extra plies? I thought there was
>no evidence of that.

Exactly!

> However pruning tricks like null move and others may
>require some 'minimal search depth' to avoid most tactical blunders. And it may
>be the case here....
>
>-Andrew-
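
For readers who want to put rough numbers on the EBF argument quoted above, here is a minimal sketch, not anything posted in the thread. It assumes EBF can be approximated as the average ratio between the time needed to finish consecutive search depths, and that each extra ply costs a constant factor. The function names and the example timings are hypothetical, chosen only for illustration. The last helper uses the usual logistic Elo model to turn a match score into a rating gap.

```python
import math


def effective_branching_factor(seconds_per_depth):
    """Estimate EBF as the geometric mean of the time ratios between
    consecutive search depths (time to finish depth d+1 / time for depth d)."""
    if len(seconds_per_depth) < 2:
        raise ValueError("need timings for at least two depths")
    ratios = [later / earlier
              for earlier, later in zip(seconds_per_depth, seconds_per_depth[1:])]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def extra_plies(time_multiplier, ebf):
    """Plies gained when the search time is multiplied by time_multiplier,
    assuming the cost of each extra ply is a constant factor ebf."""
    return math.log(time_multiplier) / math.log(ebf)


def elo_difference(score_fraction):
    """Rating gap implied by a match score under the usual logistic Elo
    model; score_fraction must be strictly between 0 and 1."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


if __name__ == "__main__":
    # Hypothetical timings (seconds to complete four consecutive depths).
    # These are made-up illustration values, not measurements of any program.
    program_a = [1.0, 3.0, 9.0, 27.0]
    program_b = [1.0, 4.0, 16.0, 64.0]
    for name, timings in (("A", program_a), ("B", program_b)):
        ebf = effective_branching_factor(timings)
        # Going from a 5-minute game to a 60-minute game is 12x the time.
        print(name, round(ebf, 2), round(extra_plies(12.0, ebf), 2))
    # The 1 min result quoted in the thread: Fritz 6a 14.5 - 5.5 out of 20.
    print(round(elo_difference(14.5 / 20.0)))
```

Under these assumptions the program with the lower EBF gains more plies from the same increase in time, which is the point made about longer time controls. Whether those extra plies convert into rating points at a constant rate is exactly what the thread disputes, so the sketch deliberately stops short of mapping plies to Elo.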