Author: Uri Blass
Date: 04:55:18 01/15/06
On January 15, 2006 at 07:10:41, Stephen A. Boak wrote:

>On January 15, 2006 at 04:56:15, Uri Blass wrote:
>
>>On January 15, 2006 at 02:07:06, Marc Lacrosse wrote:
>>
>>>>
>>>>Lacrosse's analysis showed above all that in the 87 positions he tested, that
>>>>Shredder 9 and Rybka scored 57% given 10 seconds, and Fruit and Toga and company
>>>>are much weaker with so little time, and thus much weaker in blitz.
>>>
>>>>
>>>> Albert
>>>
>>>Just a little point, Albert.
>>>
>>>What my little experience shows is not an argument for telling that engine A is
>>>better or worse than engine B at faster or slower time control.
>>>
>>>What I precisely did is the following :
>>>let say :
>>>- engine A solves "x" positions in 180 seconds and
>>>- engine B solves "y" positions in 180 seconds.
>>>I recorded:
>>>- what percentage of "x" engine A had already solved after 10 seconds
>>>- what percentage of "y" engine B had already solved after 10 seconds
>>>
>>>So each engine is compared at 10 seconds with the number of positions that it
>>>will solve _itself_ at 180 seconds
>>>
>>>So when I record that Rybka has a 57% score and Fruit a 39%, this does _not_ say
>>>that Rybka is "stronger" or "weaker" than Fruit, and we could have a much weaker
>>>1800 elo engine getting a 80% (or a 15%) score in the same test.
>>>
>>>What the little test tends to show is just that rybka has already shown 57% of
>>>its own analysis capacity at 10 seconds whereas Fruit has a larger margin of
>>>improvement (compared with itself) when given a larger time control.
>>>
>>>Marc
>>
>>Your experiment show nothing
>>
>>imagine that there are 100 problems
>>
>>imagine that engine B need square root of the time of engine A to solve
>>positions.
>>
>>engine A solves problem number n in 4n seconds for n<45 and
>>problem number n in 1000n seconds for n>=45
>>
>>engine A solves 2 problems in 10 seconds and 44 problems in 180 seconds.
>>
>>Engine B solves problem n in sqrt(4n) seconds for n<45 and in sqrt(1000n)
>>seconds for n>=45
>>
>>engine B solve 25 problems in 10 seconds and 44 problems in 180 seconds.
>>
>>engine B improve less than engine A by your test because 44/2 is bigger than
>>44/25 but it is clear that engine B improves more than engine A based on the times.
>>
>>My point is that you cannot compare number of solution in x seconds with number
>>of solutions in y seconds and get conclusions.
>>
>>The only logical comparison is comparison of time to solve x solutions and time
>>to solve y solutions and you did not do that comparison.
>>
>>Uri
>
>Hi Uri,
>
>I'm tired, and I haven't studied your above figures very much, so I'll look at
>them again later, after I've slept.
>
>But I have to ask, how can you make up *hypothetical* numbers and draw any
>conclusions? This seems far less logical than Marc's *real* experiment that
>obtains *real* figures and reports them as is.
>
>I did not see Marc draw any mathematical conclusions (above) that oppose your
>own conclusions. Instead, he only seemed to describe his test & explain the
>reported results.
>
>To the contrary, Marc carefully points out:
>
>" ... when I record that Rybka has a 57% score and Fruit a 39%, this does _not_
>say that Rybka is "stronger" or "weaker" than Fruit ...".
>
>Why do *you* create a 'strawman', i.e artificial premise (unstated conclusion),
>attribute it to *Marc*, and then shoot it down?
>
>Data gathering (experimenting) is simply data gathering. It is one of the most
>important tools of science. It *never* proves something--so why criticize the
>gathering & reporting.

The main problem is that I can learn nothing important from the data that Marc gave. I do not know the problems. I know that Fruit improved more than Rybka, while Rybka solved more positions than Fruit at both time controls (10 seconds per move and 180 seconds per move).
The problem is that by choosing the right set of positions I can always show that the weaker program improved more.

Take a simple example without numbers. In an easy set of problems, the weak program solves half of them in 10 seconds and all of them in 180 seconds, while the strong program solves all of them in 10 seconds.

You can say that the weaker program improved more because it solved twice as many positions, while the stronger program did not improve because it solved the same number. This claim is simply not convincing.

Uri
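
The hypothetical figures in the quoted post (100 problems, engine A needing 4n or 1000n seconds, engine B needing the square root of A's time) can be checked with a few lines of Python. This is only a sketch: the names time_a, time_b and solved are made up for the illustration, and a problem is assumed to count as solved when its solving time is less than or equal to the limit.

```python
import math

# 100 hypothetical problems, numbered 1..100 as in the quoted example.
PROBLEMS = range(1, 101)

def time_a(n):
    """Seconds engine A needs for problem n (hypothetical figures)."""
    return 4 * n if n < 45 else 1000 * n

def time_b(n):
    """Engine B needs the square root of engine A's time."""
    return math.sqrt(time_a(n))

def solved(time_fn, limit):
    """Count problems solved within `limit` seconds (assumes time <= limit)."""
    return sum(1 for n in PROBLEMS if time_fn(n) <= limit)

for name, fn in (("A", time_a), ("B", time_b)):
    at_10, at_180 = solved(fn, 10), solved(fn, 180)
    # Share of the 180-second solutions already found at 10 seconds.
    print(f"engine {name}: {at_10} at 10s, {at_180} at 180s, "
          f"share at 10s = {at_10 / at_180:.0%}")
```

Under these assumptions the counts match the ones in the quoted post: 2 and 44 for engine A, 25 and 44 for engine B. The "share at 10 seconds" figure makes engine B look as if it has less room to improve, even though its time per problem is far lower on every problem.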
Last modified: Thu, 15 Apr 21 08:11:13 -0700