Author: M Hurd
Date: 11:31:33 01/19/06
On January 19, 2006 at 09:39:00, Eelco de Groot wrote:

>On January 19, 2006 at 09:05:03, M Hurd wrote:
>
>>On January 19, 2006 at 08:52:00, Ricardo Gibert wrote:
>>
>>>On January 19, 2006 at 08:36:03, M Hurd wrote:
>>>
>>>>On January 19, 2006 at 08:30:55, Ricardo Gibert wrote:
>>>>
>>>>>On January 19, 2006 at 08:11:54, M Hurd wrote:
>>>>>
>>>>>>If you play an engine match of 1000 games against 1 engine and play another
>>>>>>match of 1 game each against 1000 engines, would you get the same rating?
>>>>>>
>>>>>>Is it more important to play as many different engines as possible or just
>>>>>>number of games played.
>>>>>
>>>>>Depends on what you are trying to measure. Relative strength to one particular
>>>>>engine or general strength against engines in general.
>>>>>
>>>>>>
>>>>>>Presumably there will be an optimum number for games and number of engines
>>>>>>played.
>>>>>
>>>>>Theoretically, the optimal number approaches infinity in both cases. Naturally,
>>>>>this has virtually no practical value. You will need to be more specific to get
>>>>>a more useable response.
>>>>>
>>>>>>
>>>>>>Regards
>>>>>>
>>>>>>Mike
>>>>
>>>>
>>>>Hi Ricardo
>>>>
>>>>I was simply wondering what would likely be the ELO difference between the 2
>>>>matches I outlined and which match would be the more accurate.
>>>
>>>Accurate in what sense? The 2 matches answer 2 different questions. What
>>>precisely are you trying to measure? My guess is you want to measure general
>>>playing strength rather than the relative strength between 2 particular engines.
>>>If that is the case, given those choices, this isn't a close call. One game
>>>against each of 1000 different engines is the way to go.
>>>
>>>Frankly, this ought to be obvious.
>>>
>>>>
>>>>Regards
>>>>
>>>>Mike
>>
>>
>>Frankly this is not obvious to me.
>>
>>If you play 1 game with 1 engine versus another you will get a result, however
>>this could be a win, loss or draw and tells you nothing. 1000 x nothing = nothing,
>>whereas 1000 games against 1 engine should give a more confident rating.
>>
>>Regards
>>
>>Mike
>
>Hello Mike,
>
>That makes no difference, any game tells you just as much no matter which
>opponent it is. For the rating (the TPR rating in this case) you simply compute
>the average result against the average rating of all the opponents.
>
>You get a better idea of the strength against all the different opponents if you
>play some (or just one) game against many of them, not just against one.
>That is because a rating is not a perfect predictor, some players will just have
>bad results against some of the possible opponents, their Angstgegners if you
>like. Also the average opponent-rating is a more dependable number than the
>rating of just one member of the group (there is less uncertainty involved
>because more games were played to compute the average)
>
>The situation is a bit more complex if the rating of your opponent (programs) is
>not very well known, or even unknown. Playing one or more games does not tell
>you anything about rating then, only about the difference in rating between the
>two. Therefore it becomes necessary to add to your tournament at least one but
>preferably more opponents with a known rating, and let each of the unrated
>players play against each other but also against the known ratings. Then you can
>calculate all of the ratings with a successive approximation process.
>
>hope it makes some sense..
>
> Eelco

Thanks for the explanation.

Hypothetically speaking, Fritz plays Rybka 1000 times and a rating for Fritz is
calculated from the results of the games, assuming Rybka's rating is known.

Fritz then plays 1 game against each of 1000 engines with known ratings and a
rating is calculated.

Which rating would be nearer to Fritz's likely rating, or would they be the
same, hypothetically speaking?

Regards

Mike
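
For anyone who wants to try the two hypothetical matches with numbers, the TPR that Eelco describes (average result against the average opponent rating) can be sketched in a few lines of Python. This is only a rough illustration, not how any rating list actually computes its figures. It assumes the common logistic Elo approximation, where 400 * log10(p / (1 - p)) is added to the average opponent rating. The function name, the ratings 2700/2800/2900 and the 550-point score are all made up for the example and are not the real ratings of Rybka or any other engine.

```python
import math


def performance_rating(opponent_ratings, total_score):
    """Hypothetical helper: TPR under the logistic Elo approximation.

    opponent_ratings: one rating per game played (repeat a rating if the
                      same opponent was played several times).
    total_score:      points scored (win = 1, draw = 0.5, loss = 0).
    """
    games = len(opponent_ratings)
    if games == 0:
        raise ValueError("at least one game is required")
    p = total_score / games  # average result per game
    if not 0.0 < p < 1.0:
        # A 0% or 100% score gives an unbounded rating difference.
        raise ValueError("score must be strictly between 0% and 100%")
    average_opponent = sum(opponent_ratings) / games
    return average_opponent + 400.0 * math.log10(p / (1.0 - p))


# Match A: 1000 games against one opponent (assumed rating 2800).
match_a = performance_rating([2800] * 1000, 550.0)

# Match B: 1 game each against 1000 opponents (assumed ratings
# alternating 2700 and 2900, so the average is also 2800).
match_b = performance_rating([2700, 2900] * 500, 550.0)

print(round(match_a, 1), round(match_b, 1))
```

With the same score and the same average opponent rating the two matches give the same number under this formula, which is the "any game tells you just as much" part. What the formula does not show is the rest of Eelco's point: how well the single opponent's rating is known, and whether the engine happens to do unusually well or badly against that one opponent.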
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.