Author: Didzis Cirulis
Date: 11:40:01 01/30/01
On January 30, 2001 at 13:22:21, Christophe Theron wrote:

>On January 29, 2001 at 14:49:26, Uri Blass wrote:
>
>>On January 29, 2001 at 13:01:58, Christophe Theron wrote:
>>
>>>On January 28, 2001 at 19:10:39, Peter Berger wrote:
>>>
>>>>On January 28, 2001 at 17:00:30, Severi Salminen wrote:
>>>>
>>>>>>It is very strange. Chess programs all use, more or less, the same basic principles. So the logical way is to assume that they all benefit more or less equally from faster hardware. But I have never seen anybody support this assumption. Instead of trying to demonstrate that this simple assumption is wrong, everybody just assumes that it is wrong. Why?
>>>>>>
>>>>>>I guess the answer is that it is more fun to assume that chess programs do not all benefit from faster hardware in the same way. So people believe that by changing the hardware or the time controls, big surprises can happen...
>>>>>>
>>>>>>On the other hand, it is always hard to explain that in short matches big surprises can happen FOR NO REASON.
>>>>>>
>>>>>>So people tend to draw flawed conclusions based mainly on their beliefs, and to present them as scientific evidence...
>>>>>
>>>>>This is all a result of human nature. We want to understand things we don't understand. We want to create our own set of rules in order to forecast complex systems. It is the same in computer chess: people love to see different characteristics in different programs (Gambit Tiger is a brave attacker, Hiarcs plays positional chess, Fritz is tactical...). They want to see these "new paradigms" and want to categorize programs' behaviour based on a few games. They want to see human-like behaviour. And it also looks like the people who draw these conclusions are usually not programmers (IMO :). And I don't blame them. It is impossible to know how chess engines _really_ function unless you have tried it yourself. And from a marketing point of view it would be quite boring if all engines were presented as small variations on the same principles that have been around for 30 years, wouldn't it? I wouldn't be surprised if Fritz and Junior were actually the same engine :)
>>>>>
>>>>>The point: let them have their paradigms and let us have our scientific facts. We can filter the useful information out. In this case maybe 500 games would not be enough to show anything - if there is anything to show.
>>>>>
>>>>>Severi
>>>>
>>>>I tend to believe statistically significant results are overrated: they are so easy to get, it only takes _time_. Oops, might that be the reason they are so rare?
>>>>
>>>>Look at Mr Heinz's results for the decreasing one-more-ply effect: from a statistical point of view it is quite easy to question his results and demand even more experiments to eliminate the "noise", isn't it?
>>>>
>>>>I suspect it is quite easy to prove that certain programs profit more from better hardware than others. These Nimzo tests are a good start, by the way; it is of course perfectly fine to question the reliability of those results, but they point in a certain direction. Statistics is simple and difficult at the same time. What some people seem to forget: even if you play too few games, you can still place a bet that is better than 50%, something people do all day in real life. I suspect that with this Nimzo data we are already well over 60%, by the way. It might still all be nonsense, for sure...
>>>>
>>>>The tools are there, and it is tempting to simply do it and end this "battle". To avoid the question "Is it better hardware, or does program X simply suck at blitz?", it is probably better to choose a fast time control, then something like ERT, 500 games each, time control maybe 5 minutes per game / 3 seconds increment; opponents maybe a Tiger or a Crafty against a Gandalf or a Nimzo, on a fast and on a slow computer. But statistics is tricky, otherwise this would probably already have been done.
>>>
>>>
>>>No, it's easy to do. Nothing tricky here. All you need is the hardware (many people have it) and a little time (maybe one week of computer time).
>>
>>I believe that top programs do not gain the same amount from extra time. (It simply does not sound logical to me that all programs are the same; the problem is that the difference is small.)
>
>
>This time I think we are getting close to an agreement, Uri.
>
>I'm ready to concede that there might be differences, but they are indeed impossible to detect unless you conduct a very time-consuming test.
>
>
>>I think that you need more than one week to get a significant result. The problem is that the difference is so small that a few hundred games per program are not enough to get a significant result, and I guess that you may need 10000 games per program, at blitz and at tournament time control, in order to compare.
>>
>>You clearly need more than one week to get 10000 games for every program in the SSDF list.
>>
>>It can be done by getting more testers, but we need sponsors for it.
>
>
>Calm down with the money, Uri! ;)
>
>There are people here who like to test chess programs. I think that we could begin with a 200-game match. 200 blitz games and 200 games at 1h per game would already tell us something. I believe the test could be done in approximately 20 days.
>
>
>
>    Christophe

The question is coordination... I mean, is there anybody here (CCC) who is ready to keep all the incoming data and coordinate the testing? Something like an SSDF of CCC. A good (and real) idea!

Didzis Cirulis
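A rough sketch of the sample-size point discussed above (the 200-, 500- and 10000-game figures come from the thread; the 55% score and the snippet below are purely illustrative assumptions, not anything any of the posters ran): the 95% margin of error of a match score shrinks only with the square root of the number of games.

import math

# Illustrative only: approximate 95% margin of error of a match score,
# using a normal approximation to the binomial and ignoring draws.
def margin_of_error(score, games, z=1.96):
    # Half-width of an approximate 95% confidence interval, treating each
    # game as an independent trial won with probability `score`.
    return z * math.sqrt(score * (1.0 - score) / games)

for games in (200, 500, 10000):
    moe = margin_of_error(0.55, games)  # hypothetical 55% score
    print("%6d games: 55%% +/- %.1f percentage points" % (games, 100 * moe))

Under these assumptions the interval is roughly +/-7 points at 200 games, +/-4 points at 500, and +/-1 point at 10000, which matches the feeling in the thread that a few hundred games cannot separate programs whose benefit from faster hardware differs by only a point or two, while a test on the order of 10000 games roughly could.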