Author: Peter Skinner
Date: 22:18:19 03/02/04
On March 02, 2004 at 21:13:51, Dann Corbit wrote:

>Notice these entries from the SSDF list:
>                                             Rating   +    -  Games  Won  Av.opp
>11 Chess Tiger 15.0 256MB Athlon 1200 MHz     2719   23  -22   968   59%   2655
>...
>13 Chess Tiger 14.0 CB 256MB Athlon 1200      2717   30  -30   557   61%   2638
>
>The ratings are very close. I imagine that the evaluations will be similar.
>Does that somehow indicate fraud to you?

It would depend. If version 15.0 was advertised as a "50 point elo" increase from 14.0, then yes, I would consider that fraud.

>
>And now look at this:
>25 Gandalf 4.32h 256MB Athlon 1200 MHz        2658   31  -31   514   53%   2635
>...
>27 Gandalf 5.0 256MB Athlon 1200 MHz          2649   45  -46   242   44%   2692
>28 Gandalf 5.1 256MB Athlon 1200 MHz          2637   25  -25   758   55%   2604
>
>Notice that newer versions may even be slightly weaker than older versions
>(though the difference is not statistically significant). Does that indicate
>fraud to you?

See above answers...

>All that it means to me is that it is very difficult to make a strong program
>stronger. I am sure that an author who makes a new release of his program
>imagines it to be better, and significantly so. The testing done by an author
>may not get the same results as the testing done by an independent
>organization.
>
>In my view, falsely accusing someone of fraud is as bad as committing fraud.
>
>Hinting that someone may have committed fraud is not as bad as that. But it
>still is not a very pleasant thing to do.
>
>IMO-YMMV.

I have not once said that I think he did. I was looking at data that does suggest something _could_ be awry. I did state that I did not think so, and I _hoped_ it wasn't the case.

Personally, I love proving "advertising" wrong. It is sort of a hobby. I hate advertising that is misleading, and I have even gone as far as to quit a job because of the bad advertising that company did.

I believe it was Frank Quinsky who stated here in this very forum that Ruffian 2.0.0 was "100 elo" better than 1.0.5. That was _obviously_ misleading, and completely untrue.

It does not take a rocket scientist to look past the advertising, the optimizations, and the comments about a new evaluation technique and see that certain free versions come to the same conclusions as their commercial counterparts. It is also reasonable to conclude that the commercial versions are not in fact 100 elo better than their free counterparts.

Certainly there is confusion about why, since 2.0.0, we now have two upgrades, smaller in exe size, yet all seem to suffer from the ponder bug. Even the older free versions have the same bug. How does one go from 1.0.1 to 2.1.0 without fixing that bug? It puzzles me.

The new versions could be just that, but there is some evidence that they are not. Whether that evidence is conclusive remains to be seen.

I have gone on record as stating I am not accusing Per-Ola of anything, as I have spoken with him online and I don't think he would do something like this.

Peter.
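
For anyone who wants to check the numbers quoted above, here is a rough sketch in Python. It assumes the standard logistic Elo formula and treats the list's "+" and "-" columns as an error interval around each rating. The function names are made up for illustration, and comparing whether two intervals overlap is only a crude rule of thumb, not a formal significance test.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points above the opponent,
    using the standard logistic Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def performance_rating(avg_opp, score_fraction):
    """Rating implied by a score fraction (strictly between 0 and 1)
    against opponents with average rating avg_opp."""
    return avg_opp + 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def intervals_overlap(r1, plus1, minus1, r2, plus2, minus2):
    """True if the two rating intervals [r - |minus|, r + plus] overlap.
    Crude heuristic only: overlap suggests the gap may not be significant."""
    lo1, hi1 = r1 - abs(minus1), r1 + plus1
    lo2, hi2 = r2 - abs(minus2), r2 + plus2
    return lo1 <= hi2 and lo2 <= hi1


if __name__ == "__main__":
    # Chess Tiger 15.0 vs 14.0 CB, figures taken from the quoted SSDF rows.
    print(intervals_overlap(2719, 23, -22, 2717, 30, -30))
    # Gandalf 4.32h vs 5.1, figures taken from the quoted SSDF rows.
    print(intervals_overlap(2658, 31, -31, 2637, 25, -25))
    # Score a genuine 100 point advantage would be expected to produce.
    print(round(expected_score(100), 3))
    # Rating implied by Chess Tiger 15.0 scoring 59% against a 2655 average.
    print(round(performance_rating(2655, 0.59)))
```

Both pairs of intervals overlap, which fits the remark that the differences are not statistically significant. A real 100 point gain would correspond to an expected score of roughly 64% in a head-to-head match, which is the kind of result such a claim would have to be tested against.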
Last modified: Thu, 15 Apr 21 08:11:13 -0700