Computer Chess Club Archives

Subject: Re: Junior's long lines: more data about this....

Author: Amir Ban

Date: 02:16:37 01/08/98

On January 08, 1998 at 01:27:49, Don Dailey wrote:

>I once built a table from self play data that showed win expectancies
>from scores.  I simply remembered each root node evaluation and
>graphed them all.
>
>It would be fun to do the same with Grandmaster games, just to see
>if the percentages come out the same,  in other words does evaluating
>a position 1.0 pawn up exactly predict the win percentages the same
>from GM samples  as from Cilkchess played games?   If this came out
>significantly different it would be interesting to analyze why.
>
>A further extension is to do it with games played at various levels.
>Does  +1.0 with 1k nodes search = +1.0 with 2k nodes searched when
>predicting win percentages?
>
>- Don

I did this recently. I put all games from the WMCCC through a shallow
evaluation and matched each score with the game outcome. I then plotted
it in Excel and did a best fit of the percentages with an exponential.
The fit looked visually good, and the number of odd results (that is,
scores of +4 or so which ended in a draw or loss) was small, even
though I was doing only about a 2-ply search to evaluate. The best-fit
constant was around 1.05 pawn.
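
For anyone who wants to repeat the experiment outside Excel, here is a
minimal sketch of the kind of fit described above. The post does not say
which exponential was used, so the logistic curve p = 1 / (1 + exp(-score / c))
is an assumption, as are the function names, the 0.5-pawn bucket width, and
the (score, result) sample format with results of 1, 0.5 and 0.

```python
import math

def expected_score(score, c):
    """Assumed logistic curve: expected result (0..1) for a score in pawns."""
    return 1.0 / (1.0 + math.exp(-score / c))

def bucket_results(samples, width=0.5):
    """Group (score, result) pairs by score bucket.

    Returns {bucket_center: (mean_result, count)}.
    """
    sums = {}
    for score, result in samples:
        key = round(score / width) * width
        total, n = sums.get(key, (0.0, 0))
        sums[key] = (total + result, n + 1)
    return {k: (t / n, n) for k, (t, n) in sums.items()}

def fit_constant(samples, lo=0.2, hi=5.0, steps=481):
    """Grid-search the constant c (in pawns) that minimises the
    count-weighted squared error between bucket means and the curve."""
    buckets = bucket_results(samples)
    best_c, best_err = lo, float("inf")
    for i in range(steps):
        c = lo + (hi - lo) * i / (steps - 1)
        err = sum(n * (mean - expected_score(s, c)) ** 2
                  for s, (mean, n) in buckets.items())
        if err < best_err:
            best_c, best_err = c, err
    return best_c
```

Call fit_constant() with one (score, result) pair per evaluated position.
A larger fitted constant means a given score predicts the result less
sharply, so the constant is a convenient single number for comparing two
sets of games.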

I was encouraged by this, so I did the same to the Groningen games, and
found that I was looking at a mess. The plot looked much noisier, the
best-fit graph was not very convincing, the best-fit constant was higher
at around 1.50 pawn, and there was a large number of scores above 3.00
that failed to win, and they were hopelessly distorting the picture.
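
Picking out those odd results can be sketched in a few lines. This is
only an illustration: the function name, the (score, result) sample
format, and the symmetric handling of scores below -3.00 are assumptions,
with results taken as 1, 0.5 and 0 from the point of view of the side the
score refers to.

```python
def odd_results(samples, threshold=3.0):
    """Return the (score, result) pairs where a decisive-looking score
    did not lead to the matching result (hypothetical helper)."""
    odd = []
    for score, result in samples:
        if score >= threshold and result < 1.0:
            # Clearly ahead according to the score, but failed to win.
            odd.append((score, result))
        elif score <= -threshold and result > 0.0:
            # Clearly behind according to the score, but did not lose.
            odd.append((score, result))
    return odd
```

Dividing len(odd_results(samples)) by the number of samples beyond the
threshold gives a rough measure of how often such scores mislead in a
given set of games.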

I realized that this was caused in part by blunders, so I went hunting
for some of those results, and whenever one turned out to be a blunder I
deleted the game. Later I deleted all the rapid games. This didn't make
things noticeably better. There was at least one Shirov game where I
think he was doing objectively fine with scores of -4.

I got the feeling that blunders are not the whole story, that there are
real differences, and that this is worth studying. You need a
blunder-free database of games to do a serious study. It seems that
Groningen was not the right place to look for that.

Amir
