Computer Chess Club Archives


Subject: Re: Chessfun and Nunn1 Tests

Author: blass uri

Date: 04:33:35 05/07/00



On May 07, 2000 at 06:25:36, Mogens Larsen wrote:

>On May 07, 2000 at 05:42:34, blass uri wrote:
>
>>She also found that Crafty was slowed down in the games in which Fritz was
>>leading 9-0.
>
>That doesn't exactly confirm the validity of results. There's no reason to
>assume that they cancel each other out. The uncertainty doesn't diminish, it
>grows.

I do not think that her finding problems increases the uncertainty.
Problems do not go away just because nobody knows about them.

The fact that she checked for problems reduces the uncertainty.


>
>>I do not think that she wanted to prove that Fritz is better than Crafty.
>>She did not know before her games that Fritz would win.
>
>Well, that would be a question of semantics. If you go through Chessfun's
>postings regarding this issue, she's convinced that Fritz will win. That's okay,
>but that would also be a fair guess without further testing.

She became convinced that Fritz would win only after the matches, not before
them.

>So there's no
>verifiable reasons for the test whatsoever. And the published reasons aren't
>analysed properly. Interesting.
>
>>In a small part of their games (all the games of Chessmaster) no autoplayer is
>>involved, because Chessmaster does not support the autoplayer.
>
>I know, but it's too small a part to be significant. You can't blame SSDF for
>the "weaknesses" of Chessmaster.

I do not blame the SSDF for it.
I can trust the Chessmaster results more because no autoplayer was
involved.
>
>>The question is whether the results of the SSDF games can be compared with
>>the results of people who run engine-engine games on one computer.
>>
>>In the case of the Nunn match between Crafty17.10 and Fritz6a the results
>>are similar.
>
>The Nunn matches aren't similar to the SSDF games. They are _not_ comparable.

I agree, but the SSDF method (2 computers with autoplayer) and the
one-computer method (ponder off) give similar results.

I do not see a reason why the picture would be different in tournament
time control games (except for the rating difference between the programs).


>
>>There is no proof of diminishing returns from speed, and
>>I do not know whether being 2 times slower in 1 minute/game is more
>>significant than being 2 times slower in 2 hours/40 moves games.
>
>The strain wouldn't necessarily be consistent, so cpu fluctuation might (?)
>cause a program to lose on time or reach a lower search depth. That wouldn't be
>as noticeable in standard games IMO.

In the 4 games that I found among the SSDF games, the strain was significant
in all of them.

There was a 5th game in which Junior was slowed down by a factor of 2 or 3
only for the first move out of book, and in this case the tester did not want
to repeat the game.


>
>>Chessfun's test does not give us significant results, but the reason is that
>>there are not enough games.
>
>That isn't the only reason as I have explained before.
>
>>The relevance is that it is a reason to trust chessfun's results more, because
>>she does not hide part of her games like the SSDF does.
>
>You put public access before proper testing methods, but that isn't a
>scientifically viable solution. And why do you assume that she doesn't hide the
>games? I don't think that's probable, but that isn't the point. You focus on
>games, I focus on method and analysis, or the lack of it to be precise.

I do not think the method of the SSDF is better.
It has its own problems.

For example, I found that Fritz5 won the same game against Rebel8 5 times
because Rebel8 has no learning function.

Fritz5 could get a better rating if it played more games against Rebel8, so the
rating of a program depends on the number of games played between 2 programs
and not only on the strength of the engines.

If they decide to let Fritz5(p200) play 10000 games against Rebel8(p90), they
can push Fritz5(p200) to number 1, above all the programs with better
hardware.
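
The mechanism can be illustrated with a rough sketch. It is not the SSDF's
actual rating calculation, which is not described in this thread. It assumes a
plain incremental Elo update, a hypothetical K-factor of 16, made-up starting
ratings, and an opponent whose rating is held fixed. Under those assumptions,
every repeated win counts as new evidence, so the winner's rating keeps
climbing as long as the same won game is replayed.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score against a single opponent."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def rating_after_repeated_wins(rating, opponent_rating, games, k=16.0):
    """Apply `games` consecutive wins against the same opponent.

    Assumption: the opponent's rating is held fixed and k=16 is an
    arbitrary illustrative K-factor, not a value taken from the SSDF.
    """
    for _ in range(games):
        rating += k * (1.0 - expected_score(rating, opponent_rating))
    return rating


# Hypothetical ratings, chosen only for illustration.
FRITZ_START = 2500.0
REBEL_FIXED = 2400.0

for n in (0, 5, 100, 10000):
    final = rating_after_repeated_wins(FRITZ_START, REBEL_FIXED, n)
    print(f"{n:>6} repeated wins -> rating {final:.0f}")
```

The gain per game shrinks as the rating gap widens, but it never reaches zero,
so in this simplified model enough repetitions of one won game can lift the
rating by any amount.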

Uri





Last modified: Thu, 15 Apr 21 08:11:13 -0700
