Computer Chess Club Archives



Subject: Re: Fritz 6 vs H 7.32 > Scores at 10-20-50-100-150-200-250&306 games!

Author: Harald Faber

Date: 06:10:31 07/29/00



On July 28, 2000 at 21:58:41, James Thompson wrote:

>On July 28, 2000 at 10:20:22, Harald Faber wrote:
>
>>On July 28, 2000 at 09:29:29, James Thompson wrote:
>>
>>>
>>>>>>                      Fritz 6   H7.32
>>>>>> Scores at 10 games :   2.5      7.5
>>>>>>    "      20   "   :  10.0     10.0
>>>>>>    "      50   "   :  26.5     23.5
>>>>>>    "      100  "   :  52.0     48.0
>>>>>>    "      150  "   :  79.5     70.5
>>>>>>    "      200  "   : 105.5     94.5
>>>>>>    "      250  "   : 130.5    119.5
>>>>>>    "      306  "   : 158.0    148.0
>>>>>
>>>>>  I would like to comment that the scores show a 20-game match doesn't
>>>>>mean anything! Between 50 and 100 games the score for Fritz moves up only
>>>>>slightly, from +3 to +4 points in its favor. But at 150 games the lead climbs
>>>>>considerably, to a +9 point advantage over Hiarcs 7.32. It then levels off
>>>>>at +11 for Fritz 6 at both 200 and 250 games and drops to +10 at 306 games.
>>>>>So, for two engines that are close to the same strength, it shows that you
>>>>>need 50 to 100 games, and if the engines are "extremely" close in strength it
>>>>>would be wise to play anywhere from 150 to 200 games!
>>>
>>>I understand that you need a large enough sample to get a fair assessment of the
>>>strength of one machine versus another, but I'm having a problem interpreting
>>>the results as presented.  Each game is independent of the previous game, thus
>>>you are sampling a population "with replacement".  If that's the case the margin
>>>of wins wouldn't flip, would it?  UNLESS something else affected the results,
>>>e.g. the openings played or the color each machine played. Naturally I'm
>>>assuming "engine parameters" are held constant from game to game.  Assuming this
>>>is correct, wouldn't it be possible, and a more accurate assessment, to determine
>>>that one engine is stronger when playing a particular color or a particular
>>>line(s)?  Has anyone done that kind of analysis?
>>>
>>>James
>>
>>
>>Of course the games need to have different opening lines.
>>Do you suggest playing ONE opening line with white and black against all other
>>opponents? Then continuing with another opening line and so on? Yes, that would be
>>perfect, but it takes a LOT of time that no one is able or willing to spend.
>>And you would also have to combine all those results, because it is not unlikely
>>that one program or another is favoured by one special opening line or even a
>>complete opening. Therefore one needs to have various opening lines in this
>>experiment.
>
>Actually I was suggesting doing a little more with the analysis of the results.
>For instance being able to demonstrate a propensity for a certain color or line
>(more success attacking or defending).  But you make a good point about the time
>required to complete such a detailed analysis.  Maybe it's not worth it...
>
>James


Exactly; in the meantime there is at least one new version of each program
available...
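
For anyone who wants to check how much the margins in the quoted score table can
be trusted, here is a rough sketch in Python. It assumes, as James does above,
that the games are independent, and it uses only the cumulative scores from the
table. The draw counts are not given in the thread, so the spread it prints is
only an upper bound: a single game result lies between 0 and 1, so its variance
is at most 0.25, which makes the standard deviation of the point margin at most
sqrt(n) after n games. The names SCORES and margin_summary are made up for this
illustration.

```python
import math

# Cumulative scores (games, Fritz 6, Hiarcs 7.32) copied from the quoted table.
SCORES = [
    (10, 2.5, 7.5),
    (20, 10.0, 10.0),
    (50, 26.5, 23.5),
    (100, 52.0, 48.0),
    (150, 79.5, 70.5),
    (200, 105.5, 94.5),
    (250, 130.5, 119.5),
    (306, 158.0, 148.0),
]


def margin_summary(games, fritz, hiarcs):
    """Return (point margin, upper bound on its standard deviation, Fritz share)."""
    margin = fritz - hiarcs
    # Assumption: independent games. Each result is in [0, 1], so its variance
    # is at most 0.25. The margin equals 2 * fritz - games, so its standard
    # deviation is at most 2 * 0.5 * sqrt(games) = sqrt(games).
    max_sd = math.sqrt(games)
    return margin, max_sd, fritz / games


for games, fritz, hiarcs in SCORES:
    margin, max_sd, share = margin_summary(games, fritz, hiarcs)
    print(f"{games:4d} games: margin {margin:+5.1f}, "
          f"SD of margin <= {max_sd:4.1f}, Fritz share {share:.1%}")
```

Draws make the real spread smaller than this bound, so the comparison is only
indicative, but it gives a feel for how large a margin has to be before it
stands out from game-to-game noise.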




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.