Author: Stephen A. Boak
Date: 01:27:01 07/25/03
On July 25, 2003 at 04:17:13, Stephen A. Boak wrote:
My last post was accidentally submitted before I finished editing it (sorry).
Below is a better (more complete & more completely edited) post.
>On July 25, 2003 at 03:22:25, Uri Blass wrote:
>
>>On July 25, 2003 at 01:58:41, Stephen A. Boak wrote:
>>
>>>Uri,
>>>Your postings in this thread are not understandable (to me), sorry.
>>>
>>>Please explain very carefully. Do not assume the reader knows what you are
>>>thinking, but be sure to explain very carefully everything of importance:
>>>
>>>1. The conditions of the hypothetical test (engines, time settings, etc);
>>
>>Only 2 modified Crafties.
>>2 accounts(if we want significant names)
>>Crafty0.1
>>Crafty3
>>
>>Crafty0.1 is modified to always use 0.1 seconds per move and not to ponder.
>>Crafty3 is modified to always use 3 seconds per move and not to ponder.
>>
>>
>>Both have a formula that lets them play only at the 5 3 or the 15 9 time
>>control.
>>
>>Neither limits its opponents, except perhaps to a rating difference of
>>no more than 300 elo.
>>
>>>2. The hypothetical results of the hypothetical test;
>>
>>Crafty0.1
>>2300 blitz
>>2200 standard
>>
>>Crafty3
>>2900 blitz
>>2500 standard
>>
>>>3. The conclusions you would draw from those hypothetical results of the
>>>hypothetical test.
>>
>>The difference in rating may be misleading
>
>1. Misleading for what purpose?
>
>If you erroneously try to read too much into a result, then you should *expect*
>to be misled.
>
>Assume your hypothetical test & hypothetical results are real:
>2A. What knowledge *can* properly (logically) be deduced from the data?
>
>2B. What knowledge cannot properly (logically) be deduced from the data?
>
>2C. Does it matter (for your same purpose--see above) whether the resulting Elo
>values are based on 1 game, 10 games, 100 games, 1000 games, etc?
>
> 1) How would this affect what:
>    a. logically *can* and
>    b. logically *can not*
>    be deduced from the data?
>
>2D. Does it matter (for your same purpose--see above) whether the resulting Elo
>values are based on having played the *exact same opponents*?
>
>>and 600 elo in blitz may be
>>equivalent to 300 elo in standard.
>
>There is a problem with 'logical' deduction if there is no equivalency in
>reality.
>
3. What if *some* programs [randomly] have a greater gain in going from 0.1 sec /
move to 3 sec / move, but *some other* programs [randomly] have a loss in going
from 0.1 sec / move to 3 sec / move?

4. What if the standard deviation for a reported Blitz Elo score is different
than the standard deviation for a reported Standard Elo score for the **same**
program?

5. How do these considerations (3 & 4, above) affect what *can* and *cannot* be
properly deduced from the results of the experiment?
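
To make questions 2C and 4 a little more concrete, here is a rough sketch in
Python. It assumes the usual logistic Elo expectancy formula and treats every
game as an independent win/loss trial (draws are ignored, which overstates the
spread). The function names and the 64% example score are only illustrative
assumptions, not anything taken from the hypothetical test above.

```python
import math

def expected_score(elo_diff):
    """Expected score for a side rated elo_diff points above its opponent
    (standard logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse of expected_score: Elo difference implied by a score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(score, games, z=1.96):
    """Rough interval for the Elo difference behind an observed score.

    Assumption: each game is an independent win/loss trial, so the standard
    error of the score is sqrt(p * (1 - p) / n). Bounds are clamped away from
    0 and 1 so the logarithm stays defined for very small samples.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    low = min(max(score - z * se, 1e-6), 1.0 - 1e-6)
    high = min(max(score + z * se, 1e-6), 1.0 - 1e-6)
    return elo_from_score(low), elo_from_score(high)

if __name__ == "__main__":
    # The same observed score supports very different claims at different n.
    for n in (10, 100, 1000):
        low, high = elo_interval(0.64, n)
        print(f"{n:5d} games: Elo difference roughly {low:+.0f} to {high:+.0f}")
```

The point of the sketch is only that the same observed score gives a much wider
Elo interval from 10 games than from 1000, so a blitz gap and a standard gap
cannot be compared without knowing how many games stand behind each.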
Thanks,
--Steve
>>
>>Uri