Author: Heinz van Kempen
Date: 03:04:02 10/18/05
Hi Uri,

in general I see no reason why CEGT has to justify itself here, and I think you started this because you are not happy with the start of the Fruit 2.2 Uri test. But because you have always given a lot of feedback on our testing, I will explain a bit.

<<I do not know if you are correct and I doubt if you have enough games against different opponents to prove it (I explain later in this post why I doubt if it can be correct).>>

For the CM settings I can only tell you that you can ignore them if you do not like them and doubt their worth. They have a lot of fans, including our own testers, and this is justification enough, even if the results do not convincingly show that the settings are much better. The fun of experimenting is a factor here.

<<Unfortunately CEGT are not very interested in comparison between different time controls and I see only one chessmaster in 4/40 time control so we even have no evidence that all these personalities improve relative to the default when the time control is 40/40 relative to 40/4.>>

CEGT is young. Not even one year old. The 40/4 games were started only about two or three months ago. So what do you expect?

<<It is not related. I also did not suggest that the CEGT will stop testing. I did not claim that no testing is better than testing but only that I do not understand the choice of the CEGT.>>

<<Of course the CEGT like the SSDF is free to test what they want and if the SSDF will also prefer to test 10 different personalities of one program it is their right and I will not suggest them to stop testing because of it. It is not the first time that I do not understand the choice of CEGT. I also did not understand the choice to do a small number of blitz games relative to long time control.>>

Any tester, anyone interested in engines, would like to see other matches, other conditions, other engines. This is normal, and it is the most difficult thing to agree on in a team. There has to be unification, otherwise the tests do not give statistically reliable results with many games. And you are right, we are free to do what was agreed in the team. These are all experienced testers and I am sure that a lot of useful things will be done in the future, unless some people come and destroy everything with unsound criticism.

<<The choice of blitz of 4/40 also seemed to me not very good and I thought that testers will prefer 2/40 for comparison with 40/40 but I read that some testers even preferred slower time control in the blitz games that is simply against all the idea of blitz games. The idea of blitz games is to compare between long time control and blitz to see if there are programs that are probably better in blitz. It may be possible to try to speculate from it about longer time control. As far as I know we usually see relatively small difference between 4/40 and 40/40 and it may suggest that the difference in time control should be more than 1:10 in order to see big difference so if there is no significant difference between CM default and other CM personality at 40/40 then I do not think that there is going to be a significant difference between CM default and other CM personality in a slower time control by factor of 2 or 3.>>

Blitz has no priority in CEGT and is not accepted by many as a measurement. I even had difficulty starting 40/4, because others wanted more time even for blitz. And as I said, it has only just started. We can also drop it again.

Best Regards
Heinz

Uri