Computer Chess Club Archives


Subject: Re: How to compare with Deep Blue '97 (methodically)?

Author: Uri Blass

Date: 19:22:52 08/02/01


On August 02, 2001 at 20:30:16, Mike S. wrote:

>There may be a chance to get a *rough estimation*, if a computer chess system is
>at (or even above) Deep Blue '97 level: If somebody would be capable to distil
>at least 10 good test positions from the 1997 match games. I can imagine that
>this could be done, supported by the Deep Blue logs which are downloadable
>somewhere on the net I think (I'm sure the URL is easy to find). I've heard they
>are somewhat difficult to read though (?).
>
>Preferably, we should search for "single move" situations, i.e. when D.B.
>recognised a subtle threat of Kasparov and found the clearly best defensive move
>early, or played such a threat itself, etc. We would need to find positions,
>which can suit as - very difficult - test positions. The log data (hopefully)
>shows the time D.B. needed to find those moves each. I don't expect that more
>than 10 suitable positions can be found (if at all), which is a small number -
>but still much better than comparing node rates or whatever.
>
>Then, today's chess computer systems could be tested with that, and we would
>have at least some hard facts comparison instead of speculations. If a program
>can find let's say 8 or 9 out of 10 after similar, sometimes better time, I'd
>consider it is Deep Blue level. So we could compare performance... and you know
>it, only the performance counts! :o)
>
>Please give your opinions if this idea makes sense, which I want to read before
>I start searching those logs, analyzing, testing, etc. (hopefully the idea is
>nonsense and I can save the effort :o).

I do not think the idea is nonsense.
There was no hard tactical move to find in the match, but there are positional
moves to find.

My suggestion is:
1) Look at all positions from the match (Deeper Blue to move), and also at
positions that did not occur in the match (Deeper Blue pondering on moves that
were not played).

2) Choose from these only the positions where Deeper Blue changed its mind
after more than 1 second.

3) From these, find all the positions where all top programs converge on the
same move that Deeper Blue played and where that move is not trivial for them
(most top programs cannot find it in less than 1 second).

You need to give the top programs some hours on every position.

After you find the relevant positions, you can compare the times of the top
programs with the time of Deeper Blue.
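The selection in steps 2 and 3 can be sketched in Python. This is only a rough illustration: the Position record, its field names, the engine names and all the numbers are invented here, and the real data would have to be read by hand from the Deeper Blue logs and from the engines' own output.

```python
from dataclasses import dataclass


@dataclass
class Position:
    name: str               # hypothetical label for the position
    db_move: str            # move Deeper Blue finally chose
    db_settle_time: float   # seconds until its last change of mind (from the logs)
    engine_results: dict    # engine name -> (final move, seconds until it settled)


def is_test_position(pos, threshold=1.0):
    """Apply steps 2 and 3 to one position."""
    # Step 2: Deeper Blue changed its mind after more than `threshold` seconds.
    if pos.db_settle_time <= threshold:
        return False
    results = list(pos.engine_results.values())
    if not results:
        return False
    # Step 3, first half: every top program converges on Deeper Blue's move.
    if any(move != pos.db_move for move, _ in results):
        return False
    # Step 3, second half: the move is not trivial, meaning most programs
    # need more than `threshold` seconds to settle on it.
    slow = sum(1 for _, seconds in results if seconds > threshold)
    return slow > len(results) / 2


def select_test_positions(positions, threshold=1.0):
    return [p for p in positions if is_test_position(p, threshold)]


# Invented sample data, not taken from any real log.
sample_positions = [
    Position("A", "Be4", 35.0, {"E1": ("Be4", 120.0), "E2": ("Be4", 0.5), "E3": ("Be4", 40.0)}),
    Position("B", "Qb3", 0.4, {"E1": ("Qb3", 5.0), "E2": ("Qb3", 9.0)}),    # DB decided too fast
    Position("C", "h5", 20.0, {"E1": ("h5", 30.0), "E2": ("Kf1", 60.0)}),   # engines disagree
    Position("D", "Rd1", 10.0, {"E1": ("Rd1", 0.2), "E2": ("Rd1", 0.3)}),   # trivial for engines
]

selected = select_test_positions(sample_positions)
print([p.name for p in selected])
```

The 1 second threshold is the one from the steps above. Reading "most top programs" as "more than half of them" is my assumption, and a stricter rule is just as defensible.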

Note that this experiment is biased in favour of Deeper Blue, because it
contains only positions where Deeper Blue is probably right (all programs
agree), but in spite of this I do not expect Deeper Blue to show clear
superiority in it.

It is possible to estimate how large the bias is by doing the same experiment
for other programs (for example, using Shredder 4's games against humans in the
Israeli league to estimate whether it is better or worse than programs like
Deep Fritz).


In the past I checked something similar to get an estimate of the strength of
Deeper Blue.
I checked the time that programs need to see a PV similar to Deeper Blue's in
some positions, and I found cases where Deep Fritz on a PIII 800 was only 2 or 3
times slower than Deeper Blue, so my impression is that Deeper Blue is not
better than Deep Fritz on good hardware.
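The "2 or 3 times slower" figure is just a ratio of times per position. A small Python sketch of how such ratios could be summarised over several positions follows. The timings in it are placeholders made up for illustration, not measurements, and using the geometric mean as the summary is my assumption rather than something I actually computed.

```python
from statistics import geometric_mean


def slowdown_ratios(times):
    """times: list of (engine_seconds, deeper_blue_seconds) for the same PV."""
    return [engine / db for engine, db in times if engine > 0 and db > 0]


# Placeholder timings (seconds), invented for illustration only.
sample_times = [(60.0, 30.0), (90.0, 30.0), (45.0, 15.0)]

ratios = slowdown_ratios(sample_times)
typical = geometric_mean(ratios)   # ratios multiply, so a geometric mean fits

print(ratios)
print(round(typical, 2))
```

A ratio above 1 means the program needed longer than Deeper Blue on that position, and a ratio below 1 means it was faster.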

Uri



Last modified: Thu, 15 Apr 21 08:11:13 -0700
