Computer Chess Club Archives



Subject: Re: Comparing two Identical Programs using Different Processor Speeds!

Author: Christophe Theron

Date: 10:22:21 01/30/01



On January 29, 2001 at 14:49:26, Uri Blass wrote:

>On January 29, 2001 at 13:01:58, Christophe Theron wrote:
>
>>On January 28, 2001 at 19:10:39, Peter Berger wrote:
>>
>>>On January 28, 2001 at 17:00:30, Severi Salminen wrote:
>>>
>>>>>It is very strange. Chess programs are all using, more or less, the same basic
>>>>>principles. So the logical way is to assume that they all benefit more or less
>>>>>equally from faster hardware. But I have never seen anybody supporting this
>>>>>assumption. Instead of trying to demonstrate that this simple assumption is
>>>>>wrong, everybody just assumes that it is wrong. Why?
>>>>>
>>>>>I guess the answer is that it is more fun to assume that all chess programs do
>>>>>not benefit from faster hardware in the same way. So people believe that by
>>>>>changing the hardware or the time controls big surprises can happen...
>>>>>
>>>>>On the other hand it is always hard to explain that in short matches big
>>>>>surprises can happen FOR NO REASON.
>>>>>
>>>>>So people tend to draw flawed conclusions based mainly on their beliefs, and to
>>>>>present them as scientific evidence...
>>>>
>>>>This is all a result of human nature. We want to understand things we don't
>>>>understand. We want to create our own set of rules in order to forecast complex
>>>>systems. Same in computer chess: people love to see different characteristics in
>>>>different programs (Gambit Tiger is a brave attacker, Hiarcs plays positional
>>>>chess, Fritz tactical...). They want to see these "new paradigms" and want to
>>>>categorize programs' behaviour based on a few games. They want to see
>>>>human-like behaviour. And it also looks like the people who make these
>>>>conclusions are usually not programmers (IMO :). And I don't blame them. It is
>>>>impossible to know how chess engines _really_ function unless you have tried it
>>>>out yourself. And from a marketing point of view it would be quite boring if all
>>>>engines were presented as little modifications of the same principles that have
>>>>been around for 30 years, wouldn't it? I wouldn't be surprised if Fritz and
>>>>Junior were actually the same engine :)
>>>>
>>>>The point: let them have their paradigms and let us have our scientific facts.
>>>>We can filter the useful information out. In this case maybe even 500 games
>>>>would not be enough to show anything - if there is anything to show.
>>>>
>>>>Severi
>>>
>>>I tend to believe statistically significant results are overestimated: they are
>>>so easy to get, it only takes _time_. Oops, might this be the reason they are
>>>that rare?
>>>
>>>Look at Mr. Heinz's results for the decreasing one-more-ply effect: from a
>>>statistical point of view it is quite easy to question his results and require
>>>even more experiments to eliminate the "noise", isn't it?
>>>
>>>I suspect it is quite easy to prove that certain programs profit more from
>>>better hardware than others. These Nimzo tests are a good start, btw; to
>>>question the reliability of these results is perfectly OK for sure, but they
>>>point in a certain direction. Statistics is simple and difficult at the same
>>>time. What some people seem to forget: even if you play too small a number of
>>>games, you can place a bet which is better than 50%, a thing people do all day
>>>IRL. I suspect with this Nimzo data we are already way over 60%, btw; it might
>>>still all be nonsense, for sure...
>>>
>>>The tools are there and it is tempting to simply do it to end this "battle". To
>>>avoid the question "Is it better hardware or does program X simply suck at
>>>blitz?" it is probably better to choose a fast time control; then something like
>>>ERT, 500 games each, time control maybe 5 minutes per game / 3 seconds
>>>increment; opponents maybe a Tiger or Crafty against a Gandalf or a Nimzo, on a
>>>fast and a slow computer. But statistics is tricky, else this would probably
>>>already have been done.
>>
>>
>>
>>No, it's easy to do. Nothing tricky here. All you need is the hardware (many
>>people have it) and a little time (maybe one week of computer time).
>
>I believe that top programs do not benefit equally from extra time (it simply
>does not sound logical to me that all programs are the same, but the problem is
>that the difference is small).
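
(Peter's "bet which is better than 50%" can be made precise. A rough sketch,
assuming a normal approximation to the binomial match score with worst-case
per-game variance 0.25 - draws only reduce it - and with illustrative names of
my own choosing:)

    # Confidence that the match leader is genuinely the stronger program,
    # given a short match result (normal approximation to the binomial).
    import math

    def confidence_stronger(points, n_games):
        # One-sided probability that the true edge over 50% is positive.
        se = 0.5 / math.sqrt(n_games)              # worst-case standard error
        z = (points / n_games - 0.5) / se
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    # E.g. 17 points out of 30 games:
    print(f"{confidence_stronger(17, 30):.0%}")    # ~77%: a bet well over 50%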


This time I think we are getting close to an agreement, Uri.

I'm ready to concede that there might be differences, but they are indeed
impossible to detect unless you conduct a very time-consuming test.
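
(How time-consuming? A rough sketch under the same assumptions as above - the
standard Elo formula for expected score, and a score standard error of
0.5/sqrt(n) - shows that the games needed grow with the inverse square of the
score edge:)

    # How many games before a small Elo difference stands out of the noise
    # at roughly 95% confidence (two-sided, z = 1.96)? A sketch, assuming
    # the usual Elo logistic model and worst-case per-game variance 0.25.
    import math

    def expected_score(elo_diff):
        # Expected score of the stronger program, per the Elo formula.
        return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

    def games_needed(elo_diff, z=1.96):
        edge = expected_score(elo_diff) - 0.5      # score margin over 50%
        return math.ceil((z * 0.5 / edge) ** 2)    # SE of score = 0.5/sqrt(n)

    for d in (5, 10, 20, 50):
        print(f"{d:3d} Elo -> ~{games_needed(d):,} games")
    # Roughly: 18,500 games for 5 Elo, 4,600 for 10, 1,200 for 20, 190 for 50.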



>I think that you need more than one week to get a significant result.
>The problem is that the difference is so small that some hundreds of games for
>every program are not enough to get a significant result, and I guess that you
>may need 10,000 games for every program, at blitz and at tournament time
>control, in order to compare.
>
>You clearly need more than one week to get 10,000 games for every program in
>the SSDF list.
>
>It can be done by getting more testers, but we need sponsors for it.


Calm down about the money, Uri! ;)

There are people here who like to test chess programs. I think that we could
begin with a 200-game match: 200 blitz games and 200 games at 1 hour per game
would already tell us something. I believe the test can be done in approximately
20 days.
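
(For scale, the same back-of-the-envelope assumptions as in the sketches above
give the rating resolution of an n-game match, i.e. how large a difference it
can pin down:)

    # 95% margin of error of an n-game match score, translated into Elo.
    # Same sketch assumptions: worst-case variance, normal approximation.
    import math

    def elo_margin(n_games, z=1.96):
        margin = z * 0.5 / math.sqrt(n_games)      # worst-case score margin
        p = 0.5 + margin
        return -400.0 * math.log10(1.0 / p - 1.0)  # invert the Elo formula

    for n in (200, 500, 1000, 10000):
        print(f"{n:6d} games -> +/- {elo_margin(n):.0f} Elo")
    # 200 games resolve about +/- 48 Elo; 10,000 games about +/- 7 Elo.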



    Christophe


