Author: Ulrich Tuerke
Date: 02:25:09 01/07/03
On January 06, 2003 at 16:56:35, Tom King wrote:

>Hi all,
>
>What do people think about playing different versions of your program against
>each other as a way of testing?
>
>I'm playing around with it right now, between v0.07 and a newer version of my
>program. The newer version is winning handsomely: +24,=18,-10.
>
>This implies a reasonably impressive increase in strength, almost 100 ELO. Ok,
>ok, it's a small sample, so the margin of error could be big.
>
>However, my gut feel is that playing different versions of your programs tends
>to overstate the strength differences. What do people think?
>
>Rgds,
>Tom
>
>tom@silentshark.co.uk

Hi Tom,

my "improvements" usually look much more modest. (Hopefully, they aren't an illusion. :-) )

IMHO, it's very difficult to evaluate the result of a program change. I think that matches against previous versions follow their "own rules". Nevertheless, I think that they must be done in order to exclude major flaws.

I don't think that differences will be exaggerated this way. However, I have observed that you have to play really LOTS of games in order to find out which version is doing better. It seems I'm not as lucky as you are, because often the weaker version will lead in the first 20 games and subsequently (for me often quite unexpectedly, just before I'm going to throw in the towel) the stronger version pulls ahead.

I personally like to use the Fritz GUI to test different versions. Both get the same small book. The GUI is clever enough to replay every game with the opponents exchanged, so randomness from the book is kind of minimized.

Uli
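To put rough numbers on the "almost 100 ELO" and "margin of error" points in the quoted post, here is a minimal sketch. It assumes the usual logistic Elo model and treats every game as an independent trial, which is only an approximation (games replayed with colours exchanged from the same book line are not strictly independent). The function names `elo_from_score` and `match_estimate` are made up for this illustration and are not taken from any program mentioned in the thread.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction under the logistic model."""
    # Clamp so that a 0% or 100% score does not produce a division by zero.
    score = min(max(score, 1e-9), 1.0 - 1e-9)
    return 400.0 * math.log10(score / (1.0 - score))


def match_estimate(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    z=1.96 gives an approximate 95% interval using a normal approximation,
    which is itself crude for small samples.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win=1, draw=0.5, loss=0) around the mean.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


if __name__ == "__main__":
    elo, low, high = match_estimate(24, 18, 10)
    print(f"Elo difference: {elo:+.0f} (approx. 95% interval {low:+.0f} to {high:+.0f})")
```

For +24,=18,-10 this gives a point estimate a little under 100 Elo, but the interval runs from roughly 20 to roughly 180 Elo, so a 52-game match cannot pin the difference down at all precisely. Since the width shrinks only with the square root of the number of games, this fits the observation that really LOTS of games are needed.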
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.