Author: Robert Hyatt
Date: 17:07:27 09/14/00
On September 14, 2000 at 12:51:50, James T. Walker wrote:

>At the risk of being on the wrong side of programmers, I have to agree with
>Enrique. I think programmers, especially commercial programmers, put a lot of
>emphasis on the World Championship because of the extra sales it might bring.
>In my opinion it is just another tournament with a lot at stake. Testing
>programs like the SSDF does is a more "real world" situation. Still the best
>test of all would be hundreds of games vs humans. But even better than that is
>that you test it yourself and decide if it does what YOU want.
>Jim Walker

It really isn't "real world". I.e., when you buy a new engine to play against, how often do you play hundreds of games? The SSDF test method tests one particular aspect of an engine quite well: how will it adapt after it wins or loses a game? But if you are playing just one game against Kasparov, that aspect is totally unimportant.

If you are playing in a WMCCC event, you might well spend a lot of time preparing book lines to kill your opponents. I don't have time for this, so I usually try to prepare lines to take the opponent out of book before his book can kill me (I didn't have time even for this in the recent WMCCC event). So the WMCCC event tests programmer preparation more than any other factor.

I think ICC is the _hardest_ test to pass. You will play humans at IM and GM strength, and they will play hundreds of games, trying to crack your book or find a positional weakness they can exploit, and then they will exploit it over and over until you fix it. Opening preparation is no good there if you play 100% automatically. You _must_ have some randomness or you get killed no matter how good your "good lines" are.

I think it is a question of 'benchmarking'. When someone asks _me_ which processor to buy, I don't randomly say "buy Intel" or "buy AMD". I _always_ say "benchmark the software you want to run and use the result to choose," because different programs respond differently to different processors. And the processor you get the best results on won't necessarily be the one I get the best results on.

Similarly, you should test an engine in the environment you expect to use it in. If you want to annotate/analyze your games for errors, who cares how its "learning" works? If you want to run automatically on the servers using one of the new auto-interfaces that are available, you had _better_ have learning facilities or you get cooked, but good. If your goal is to troll around ICC trying to produce the most inflated rating you can, you should probably choose the program at the top of the SSDF list. (For the record, I dislike ICC computer accounts that run multiple programs... it is not a reasonable thing to do under one account, any more than it would be reasonable to have a GM, an IM and a patzer all playing using one handle.) If your intent is to find the strongest opponent for yourself or to use against a human, then the SSDF list is not the place to look.

That's about as simply as I can explain what ought to be going on when someone looks at the various results. Just because a program is on top of the SSDF list does _not_ mean it will do the best against GM players. Just because a program does better than others against GM players does not mean it will do better on the SSDF list. And you can add any sub-combinations of the above that you like to the discussion.

Some SSDF games are played in long matches where learning is critical. Some are played as single games where learning is not used at all. Learning doesn't flow across two different testers either. Ditto on the chess servers and at WMCCC events... All very cloudy, IMHO. Hard to see the facts through the heavy "fog"...
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.