Author: Harald Faber
Date: 06:10:31 07/29/00
On July 28, 2000 at 21:58:41, James Thompson wrote:

>On July 28, 2000 at 10:20:22, Harald Faber wrote:
>
>>On July 28, 2000 at 09:29:29, James Thompson wrote:
>>
>>>>>>                          Fritz 6    H7.32
>>>>>> Scores at  10 games :      2.5       7.5
>>>>>>    "       20   "   :     10.0      10.0
>>>>>>    "       50   "   :     26.5      23.5
>>>>>>    "      100   "   :     52.0      48.0
>>>>>>    "      150   "   :     79.5      70.5
>>>>>>    "      200   "   :    105.5      94.5
>>>>>>    "      250   "   :    130.5     119.5
>>>>>>    "      306   "   :    158.0     148.0
>>>>>
>>>>>I would like to comment that the scores show a 20-game match
>>>>>doesn't mean anything! Between 50 and 100 games the score for Fritz
>>>>>moves up only slightly, from +3 to +4 points in its favor. But at 150
>>>>>games the score climbs considerably to a +9 point advantage over
>>>>>Hiarcs 7.32. Then it starts tapering off between 200 and 250 games,
>>>>>with a point advantage of +11 for Fritz 6, and then drops to a +10
>>>>>point advantage at 306 games. So, for two engines that are close to
>>>>>the same strength, it shows that you need 50 to 100 games, and if the
>>>>>engines are "extremely" close in strength it would be wise to play
>>>>>anywhere from 150 to 200 games!
>>>
>>>I understand that you need a large enough sample to get a fair assessment of the
>>>strength of one machine versus another, but I'm having a problem interpreting
>>>the results as presented. Each game is independent of the previous game, thus
>>>you are sampling a population "with replacement". If that's the case, the margin
>>>of wins wouldn't flip, would it? UNLESS something else affected the results,
>>>e.g. the openings played or the color each machine played. Naturally I'm
>>>assuming "engine parameters" are held constant from game to game. Assuming this
>>>is correct, wouldn't it be possible, and a more accurate assessment, to determine
>>>that one engine is stronger when playing a particular color or a particular
>>>line(s)? Has anyone done that kind of analysis?
>>>
>>>James
>>
>>Of course the games need to have different opening lines.
>>Do you suggest playing ONE opening line with white and black against all other
>>opponents? Then continue with another opening line and so on? Yes, that would be
>>perfect, but it takes a LOT of time that no one is able or willing to spend.
>>And you would also have to combine all those results, because it is not unlikely
>>that one program or another is favoured by one special opening line or even a
>>complete opening; therefore one needs to have various opening lines in this
>>experiment.
>
>Actually I was suggesting doing a little more with the analysis of the results.
>For instance, being able to demonstrate a propensity for a certain color or line
>(more success attacking or defending). But you make a good point about the time
>required to complete such a detailed analysis. Maybe it's not worth it...
>
>James

Exactly; in the meantime there is at least one new version of each program available...
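
For anyone who wants to check how much the quoted margins really say, here is a rough sketch in Python. It is not something anyone in this thread ran. It takes the cumulative scores from the quoted table and puts an approximate 95% interval around Fritz's score fraction at each checkpoint. The assumptions are mine: games are treated as independent, the normal approximation is used (it is poor at 10 or 20 games), and the per-game variance is bounded by p(1-p), which is conservative because draws reduce the variance. The names `CHECKPOINTS`, `score_interval` and `summarize` are made up for this illustration.

```python
import math

# Cumulative scores copied from the quoted match: (games, Fritz 6, Hiarcs 7.32)
CHECKPOINTS = [
    (10, 2.5, 7.5),
    (20, 10.0, 10.0),
    (50, 26.5, 23.5),
    (100, 52.0, 48.0),
    (150, 79.5, 70.5),
    (200, 105.5, 94.5),
    (250, 130.5, 119.5),
    (306, 158.0, 148.0),
]


def score_interval(games, points, z=1.96):
    """Return (score fraction, low, high) for an approximate 95% interval.

    Assumption: games are independent. The standard error uses p(1-p) as an
    upper bound on the per-game variance, so the interval is conservative.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    return p, p - z * se, p + z * se


def summarize(checkpoints=CHECKPOINTS):
    """Return one row per checkpoint: (games, margin, score, low, high)."""
    rows = []
    for games, fritz, hiarcs in checkpoints:
        p, low, high = score_interval(games, fritz)
        rows.append((games, fritz - hiarcs, p, low, high))
    return rows


if __name__ == "__main__":
    for games, margin, p, low, high in summarize():
        print(f"{games:4d} games: margin {margin:+5.1f}, "
              f"score {p:.3f}, approx. 95% interval [{low:.3f}, {high:.3f}]")
```

Under these assumptions the interval still contains 50% at every checkpoint, including the one at 306 games, so the +10 margin on its own would not separate the two engines. A tighter estimate would need the actual win/draw/loss counts, which are not given in the thread.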
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.