Author: Uri Blass
Date: 20:15:54 08/02/01
On August 02, 2001 at 22:45:10, Dann Corbit wrote:

>On August 02, 2001 at 22:22:52, Uri Blass wrote:
>
>>On August 02, 2001 at 20:30:16, Mike S. wrote:
>>
>>>There may be a chance to get a *rough estimation*, if a computer chess system is
>>>at (or even above) Deep Blue '97 level: If somebody would be capable to distil
>>>at least 10 good test positions from the 1997 match games. I can imagine that
>>>this could be done, supported by the Deep Blue logs which are downloadable
>>>somewhere on the net I think (I'm sure the URL is easy to find). I've heard they
>>>are somewhat difficult to read though (?).
>>>
>>>Preferably, we should search for "single move" situations, i.e. when D.B.
>>>recognised a subtle threat of Kasparov and found the clearly best defensive move
>>>early, or played such a threat itself, etc. We would need to find positions,
>>>which can suit as - very difficult - test positions. The log data (hopefully)
>>>shows the time D.B. needed to find those moves each. I don't expect that more
>>>than 10 suitable positions can be found (if at all), which is a small number -
>>>but still much better than comparing node rates or whatever.
>>>
>>>Then, today's chess computer systems could be tested with that, and we would
>>>have at least some hard facts comparison instead of speculations. If a program
>>>can find let's say 8 or 9 out of 10 after similar, sometimes better time, I'd
>>>consider it is Deep Blue level. So we could compare performance... and you know
>>>it, only the performance counts! :o)
>>>
>>>Please give your opinions if this idea makes sense, which I want to read before
>>>I start searching those logs, analyzing, testing, etc. (hopefully the idea is
>>>nonsense and I can save the effort :o).
>>
>>I think that the idea is not nonsense.
>>There was no hard tactical move to find but there are positional moves to find.
>>
>>My suggestion is:
>>1) Look at all positions from the match (Deep blue to move)
>>or not from the match (Deep blue to ponder on moves that were not played).
>>
>>2) Choose from these positions only the positions when Deeper blue changed its
>>mind after more than 1 second.
>>
>>3) Find from these positions all the positions when all top programs converge for
>>the same move that Deeper blue played when it is not trivial for them (most top
>>programs cannot do it in less than 1 second).
>>
>>You need to give the top programs some hours for every position.
>>
>>You can compare the times of top programs with the time of Deeper blue after you
>>find the relevant positions.
>>
>>Note that this experiment is biased for Deeper blue because it contains only
>>positions when Deeper blue is probably right (all programs agree) but in spite of
>>this fact I do not expect Deeper blue to show clear superiority in this
>>experiment.
>>
>>It is possible to get an estimate how much it is biased by doing the same
>>experiment for other programs (for example using shredder4's games against humans
>>in the Israeli league to estimate if it is better or worse than programs like
>>Deep Fritz)
>>
>>
>>I checked in the past something similar to get an estimate for the strength of
>>deeper blue.
>>I checked the times that programs need to see similar pv to Deeper blue in some
>>positions and I found cases when Deep Fritz on PIII800 was only 2 or 3 times
>>slower than Deeper blue so my impression is that Deeper blue is not better than
>>deep fritz on good hardware.
>
>I think it is absurd to try to judge the strength of a program from 100 games.
>Will we do it from 100 moves?
>
>Sometimes, programs that I work on may make a smart move. Often -- for the
>wrong reason altogether.
>
>What happens if a move is so brilliant that nobody gets it except the machine/GM
>who made it?

Unfortunately we have to ignore these moves and use only the moves that
everybody can find after a long time.
Everybody should include Deeper blue, in order to ignore possible cases when
Deeper blue made a move that is so brilliant, or simply wrong, that no program
can find it.

>
>I just don't believe that this approach works.
>
>
>On the other hand, there are a lot of people who agree with you. I don't know
>how many times I have heard someone say they know exactly how strong a program
>is by simply examining the moves of one game.
>
>Surely you will agree (at least) that any such measures are purely subjective.

I do not suggest only looking at the moves but also analyzing them by giving
programs many hours, and this is the difference between me and other people.

I agree that we cannot be sure about the strength of the program from this
analysis, but we can get an estimate for it.

Uri
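
For anyone who wants to try this, the selection in steps 1) to 3) of the quoted suggestion, and the time comparison that follows it, could be sketched roughly as below. This is only an illustration under assumptions: the `Position` record, its field names, and the functions `select_positions` and `slowdown_ratios` are all hypothetical, the times are assumed to have already been extracted by hand from the Deep Blue logs and from long engine runs, and "most top programs" is read here as a strict majority.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Position:
    """One candidate test position (all fields are hypothetical names)."""
    label: str                  # any identifier for the position
    db_move: str                # move Deeper Blue finally settled on
    db_seconds: float           # time at which it settled on that move (from the logs)
    # Per engine: seconds until it settles on db_move, or None if it never does
    # within the hours it was given.
    engine_seconds: Dict[str, Optional[float]]


def select_positions(positions: List[Position],
                     min_seconds: float = 1.0) -> List[Position]:
    """Keep positions matching steps 2) and 3) of the suggestion."""
    selected = []
    for p in positions:
        # Step 2: Deeper Blue changed its mind after more than 1 second.
        if p.db_seconds <= min_seconds:
            continue
        times = list(p.engine_seconds.values())
        # Step 3a: every engine must converge on Deeper Blue's move.
        if any(t is None for t in times):
            continue
        # Step 3b: the move must not be trivial, i.e. most engines
        # (assumed: a strict majority) need more than min_seconds.
        slow = sum(1 for t in times if t > min_seconds)
        if slow * 2 <= len(times):
            continue
        selected.append(p)
    return selected


def slowdown_ratios(positions: List[Position], engine: str) -> List[float]:
    """Engine time divided by Deeper Blue time, one value per position."""
    return [p.engine_seconds[engine] / p.db_seconds for p in positions]
```

A ratio of 2 or 3 from `slowdown_ratios` would correspond to the "only 2 or 3 times slower" observation mentioned in the quoted post. Any numbers fed into this sketch would have to come from real logs and real engine runs; none are supplied here.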
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.