Author: chandler yergin
Date: 15:29:10 02/01/06
On February 01, 2006 at 17:15:11, Dann Corbit wrote:

>On February 01, 2006 at 16:44:07, chandler yergin wrote:
>
>>On February 01, 2006 at 16:16:25, Dann Corbit wrote:
>>
>>>On February 01, 2006 at 16:03:41, Uri Blass wrote:
>>>
>>>>On February 01, 2006 at 14:48:50, Dann Corbit wrote:
>>>>
>>>>>On February 01, 2006 at 13:27:44, Uri Blass wrote:
>>>>>
>>>>>>On February 01, 2006 at 13:08:44, Dann Corbit wrote:
>>>>>>
>>>>>>>On February 01, 2006 at 12:31:55, Uri Blass wrote:
>>>>>>>
>>>>>>>>On February 01, 2006 at 12:04:47, Dann Corbit wrote:
>>>>>>>>
>>>>>>>>>On February 01, 2006 at 11:14:36, David B Weller wrote:
>>>>>>>>>
>>>>>>>>>>I was just here trying to figure out why my engine doesn't get a certain bm for a
>>>>>>>>>>positional test, and it occurred to me ...
>>>>>>>>>>
>>>>>>>>>>Why would I trust that?
>>>>>>>>>>
>>>>>>>>>>Many of the basic terms, e.g., isolated pawn, have a fairly well established
>>>>>>>>>>value, representing a statistical average over many, many positions
>>>>>>>>>>
>>>>>>>>>>If my engine is missing some positional move, for no other reason than I can
>>>>>>>>>>tell, except perhaps my isolated = 20 should be isolated = 25, then I am
>>>>>>>>>>disregarding the trillions of other positions where it is, statistically
>>>>>>>>>>speaking, really 20
>>>>>>>>>>
>>>>>>>>>>As it has been pointed out many times, these test suites are good only for
>>>>>>>>>>detecting gross errors
>>>>>>>>>>
>>>>>>>>>>So if you plan on tweaking the value of your SE metrics by test suites, make
>>>>>>>>>>sure it has about a million positions ;-)
>>>>>>>>>>
>>>>>>>>>>Maybe this is why 'auto' tuning is hard. Because if the suite doesn't contain
>>>>>>>>>>enough data to be representative of all the features one is trying to tune, it
>>>>>>>>>>will just be a waste of time, and make it worse...
>>>>>>>>>>
>>>>>>>>>>It could be that many problems can be easily solved, simply by inflating or
>>>>>>>>>>deflating the right term(s). And certainly a 'genetic' algorithm would find the
>>>>>>>>>>right ones to inflate/deflate on a small set of positions in order to get more
>>>>>>>>>>of them right...
>>>>>>>>>>
>>>>>>>>>>Fact is, it could be the very reason the position got in the test suite, is
>>>>>>>>>>because it is a little 'freakish'. Then what? We're tuning our engines to
>>>>>>>>>>become worse!
>>>>>>>>>>
>>>>>>>>>>my $0.02
>>>>>>>>>>
>>>>>>>>>>IMHO
>>>>>>>>>>
>>>>>>>>>>-David
>>>>>>>>>
>>>>>>>>>And yet the really good engines tend to solve all of them, or nearly all of
>>>>>>>>>them.
>>>>>>>>
>>>>>>>>You are talking about tactical suites when David was talking about positional
>>>>>>>>suites.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Of course, an equal problem to test suites is that all of them are full of
>>>>>>>>>outright mistakes and errors.
>>>>>>>>>
>>>>>>>>>Probably the best debugged suite is WAC and yet I imagine that it still contains
>>>>>>>>>errors.
>>>>>>>>
>>>>>>>>I doubt if it is the best debugged suite.
>>>>>>>
>>>>>>>I am very sure of it. Every position has been analyzed by multiple strong
>>>>>>>engines for long time control. No other suite has the same effort applied to it
>>>>>>>as far as I know.
>>>>>>
>>>>>>I am surprised to read it because
>>>>>>I think that programmers usually use WAC only at fast time control when they use
>>>>>>other test suites at longer time control so common sense tells me that other
>>>>>>test suites were probably tested more at long time control.
>>>>>>
>>>>>>I remember that I reported about some alternative solutions in arasan that were
>>>>>>corrected.
>>>>>>
>>>>>>I also reported about some cases when there are additional solutions in ecmgcp.
>>>>>>
>>>>>>Note that if cooks mean more than one winning move then I am also sure that
>>>>>>there are many cooks in WAC.
>>>>>>
>>>>>>There are winning moves that it is clear that no good program is going to play
>>>>>>and my opinion is that position can be considered as position with no errors
>>>>>>even if it has more than one winning move as long as we can practically expect
>>>>>>all programs to find the same move.
>>>>>
>>>>>Clearly we cannot expect it. If every program made the same move as the others
>>>>>there would be no need even to play them against each other. And if one program
>>>>>finds a different (and potentially even better) solution to a problem and yet is
>>>>>scored as having failed the position, then clearly it is the position that is
>>>>>broken and not the program.
>>>>
>>>>I think that there is better solution of 2 winning moves.
>>>>
>>>>If one move gives mate in 2 and one move wins the queen then for me winning the
>>>>queen is wrong solution for practical purposes even if I am sure that it wins
>>>>the game because I expect strong programs not to find it.
>>>
>>>If there is a mate in 1 and a mate in 12000, then they are both solutions.
>>>If one solution is a mate and the other is not, then the other [non-mate] may or
>>>may not be a solution.
>>>
>>>>If you will check the WAC test by this way you may find that many solutions of
>>>>it are not correct because the side that has mate in 2 can get rook advantage
>>>>and win the game more slowly.
>>>
>>>If they are certain wins, then they are also solutions. The object of the game
>>>is to win if you can win, else draw if you can draw.
>>>
>>>>>Every winning move (for a won position) is one of the solutions and if the
>>>>>solutions are missing then the solution should be corrected.
>>>>
>>>>I will not be surprised if by your definitions most of the WAC positions should
>>>>be corrected.
>>>
>>>They should be corrected if proven.
>>By whom? The book is written.
>
>I am talking about the test suite and not about the book. Books can contain
>mistakes.
>
>>>For instance, if one move is a mate and the other is not proven to be a mate,
>>>then it is not a correction yet.
>>Who is going to Print the correction?
>
>It does not matter if the book is corrected or not. The test suite can be
>corrected. If someone wants to write it up, then that is a bonus.
>
>> But if it can be absolutely proven to win,
>>>then it is an alternative solution.
>>Why should an alternate solution be given any credibility?
>
>If it is proven to win then it is credible.
>
>>The "best" move leading to Mate IS the solution.
>
>If there are multiple moves leading to mate then there are multiple solutions.

Yep, I guess that's called a "Cook" at least in Problem Compositions.. ;)

>It is also possible for moves to be published as best moves that actually lead
>to a direct loss.

Indeed!

>I know of at least one of these (not in WAC though).

I know there are many so-called "Test Positions" that are completely wrong.
I have Posted several.

>There are some WAC positions that are questionable as to whether they win or not.

Of course! I have Posted some of these too.
But, I'm not going to dig through Archives..
Just Post some of the EPD's I have listed; and do the Analysis yourself.
OK?

Thanks,
Chan
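
The point argued above, that every proven winning move should count as a solution, maps onto how a suite can be scored: the EPD `bm` operation may list more than one move, and a test harness can accept any of them. A minimal sketch of that idea follows. The function names `normalize`, `parse_bm` and `is_solved` are hypothetical, and the record is a made-up illustration position (two rooks, either of which mates on the back rank), not an entry from WAC or any published suite.

```python
def normalize(move):
    # Drop check/mate/annotation suffixes so "Ra8#" and "Ra8" compare equal.
    return move.rstrip("+#!?")


def parse_bm(epd_record):
    # An EPD record is four position fields followed by ';'-terminated operations.
    # Simplification (assumption): no ';' appears inside a quoted operand.
    fields = epd_record.split(None, 4)
    operations = fields[4] if len(fields) > 4 else ""
    for operation in operations.split(";"):
        parts = operation.split()
        if parts and parts[0] == "bm":
            # Every operand of `bm` is an accepted solution.
            return {normalize(m) for m in parts[1:]}
    return set()


def is_solved(epd_record, engine_move):
    # The engine passes if its move matches ANY listed best move.
    return normalize(engine_move) in parse_bm(epd_record)


# Hypothetical record with two equally valid mates in one.
record = '6k1/8/6K1/8/8/8/8/RR6 w - - bm Ra8 Rb8; id "illustration.1";'
print(is_solved(record, "Rb8#"))
print(is_solved(record, "Kf6"))
```

Correcting a "cooked" position then only means adding the extra move to the `bm` list of its record; the harness itself does not change.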
Last modified: Thu, 15 Apr 21 08:11:13 -0700