Computer Chess Club Archives


Subject: Re: Statistics errors in opening books created with the Shredder8/Fritz-GUI

Author: William Penn

Date: 11:31:10 12/29/04


On December 28, 2004 at 23:09:24, Norm Pollock wrote:

>On December 28, 2004 at 22:10:45, William Penn wrote:
>
>>On December 28, 2004 at 21:25:59, Stephen A. Boak wrote:
>>
>>>On December 28, 2004 at 20:54:45, William Penn wrote:
>>>
>>>>Statistics errors in opening books created with Shredder8/Fritz-GUI
>>>>-------------------------------------------------------------------
>>>>
>>>>I have been creating opening books for quite a while with the Shredder 8
>>>>CB/Fritz-GUI. [I'm talking about the same GUI that comes with Fritz 8, Junior 8,
>>>>etc. - not the "Classic" Shredder GUI. Too bad such ambiguities exist, but
>>>>hopefully that is clear enough.]
>>>>
>>>>The procedure (from the chessboard window in the GUI) is:
>>>>1) File menu, New, Openings book... (creates a new opening book)
>>>>2) Edit menu, Openings Book, Import Games... (imports games from any desired
>>>>database)
>>>>
>>>>The resulting opening book can be displayed in the Book Pane. That is available
>>>>by default as one of the tabs in the Notation pane, along with the Notation and
>>>>Scoresheet tabs. Or it can be displayed separately by selecting "Extra Book
>>>>Pane" from the Window menu, Panes...
>>>>
>>>>For any given position on the chessboard, the opening book shows the number of
>>>>games, percentage scored, average rating, performance, and probabilities -
>>>>overall, and for each of the possible moves in the position from the database
>>>>involved. A histogram (bar graph) can also be displayed at the bottom of the
>>>>Book Pane with the number of white wins, draws, black wins, percentages, etc.
>>>>
>>>>Much to my dismay, I have just discovered that the data and statistics shown are
>>>>inaccurate. There must be a bug somewhere.
>>>>
>>>>While the total number of games in a given position appears to be correct in all
>>>>cases, the other statistics & data for the variations are incorrect. For example,
>>>>there may actually only be one (1) game in the database for a particular move
>>>>from the position, but the Book Pane may say that there are 3 games, or 5 games,
>>>>etc. with corresponding incorrect statistics. Search of the database involved
>>>>for that position proves that the displayed statistics are incorrect, which is
>>>>easy to do when only a few games are involved.
>>>>
>>>>This error always displays more games than actually exist in the database.
>>>>Exactly where the "extra" games come from is a mystery, because they do not
>>>>exist in the database. It could be some kind of tree-building error when the
>>>>games are imported. Or it could be an error in how the data is displayed (more
>>>>of a GUI-based error). I don't know which.
>>>>
>>>>I also don't know if the error relates to the Length designated when importing
>>>>games. The dialog box that appears asks you to select an Absolute length or an
>>>>ECO-relative length of 1-100. This determines how big the tree is. The larger
>>>>the Length selected, the bigger (more MB or GB) the resulting opening book.
>>>>I usually choose a Length in the range of 15-35, depending on the size of the
>>>>database.
>>>>
>>>>I also don't know if the error relates to the size of the database involved.
>>>>Most of my databases are large, the largest over 3,000,000 games. Or perhaps
>>>>there is some limit on the size of the resulting book files, but I don't know.
>>>>
>>>>At this point, I also don't know exactly how bad the error is. If it's only a
>>>>handful of games, it wouldn't matter very much when the sample of games in a
>>>>given position is large.
>>>>
>>>>The uncertainties involved make the resulting statistics practically worthless,
>>>>pending further research. Too bad, because it could be quite valuable if
>>>>accurate based on the database involved.
>>>>
>>>>At this point, I'm not aware of any way to avoid this bug. Does anyone know how
>>>>it can be avoided?
>>>>
>>>>My apology for saying "I don't know" so often, but alas there is no detailed
>>>>documentation or instructions for these operations insofar as I'm aware. You
>>>>just have to discover them on your own, then try to get lucky, I guess. Alas
>>>>that is typical of ChessBase documentation. They assume you already know
>>>>everything.
>>>>WP
>>>
>>>I have detected similar puzzles in the past.
>>>
>>>Some thoughts (ideas only):
>>>
>>>1. Transpositions may mean the position arrived at after making the displayed
>>>tree move was reached from different prior positions.
>>>
>>>Thus a search from the current position may only show one instance.
>>>
>>>2. Some transpositions may be included only in the notes within a game, but not
>>>with a separately saved database game entry.
>>>
>>>3. Reversed color openings may be included in the statistics, but a search may
>>>not find them, even if "equivalent".
>>>
>>>I suppose a test, creating a small database and corresponding tree, *may* show
>>>something, but then the problem may be hard to replicate.
>>>
>>>Let us know if you learn anything more.
>>>
>>>Regards,
>>>--Steve
>>
>>Thanks for your thoughts. None of my databases include any annotations, only raw
>>game scores, which rules out that possible source of error at least. Here is a
>>very simple example of how the Book Pane appears for a particular opening book
>>and position, which I happen to be studying at the present time:
>>
>>The Book Pane says there are a total of 2 games in the underlying database for
>>the current chessboard position. That is correct.
>>
>>It also shows there is 1 game in the database after the move 9...a6 is made.
>>That is also correct.
>>
>>Now the error...
>>It says there are 5 games in the database after the move 9...d6 is made. That is
>>divided into 2 white wins, 1 draw, and 2 black wins. That is very incorrect! In
>>fact, there is only 1 game in the database after 9...d6 (a draw). Where could it
>>possibly find the 4 extra games, which do not exist in the database from which
>>the opening book was prepared?
>>WP
>
>Did you have book learning on? Games played with book learning on will adjust
>the stats.

Thanks for your suggestion; however, I know zilch about "book learning". I've
never accessed nor changed any settings related to book learning. But at your
prompting, I've now just read the relevant section in the Shredder 8
Fritz/CB-GUI's Help. As best as I can understand, book learning is not relevant
to these statistics errors in the opening books I have prepared. Please correct
me if I'm wrong in that regard...

I do not use Shredder 8 to play games of chess. I only use it to analyze
positions for long periods of time in infinite analysis mode. How then could
book learning become involved in the statistics that were compiled from a
particular database of games? Note also that I have many opening books compiled
from many different databases, and they all show these same kinds of statistics
errors.
WP
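
The transposition idea raised earlier in the thread can be tried out on a toy
model. The Python sketch below is only an illustration under loud assumptions:
it assumes the book stores one counter per position (not per move sequence),
and it stands in for a real position by the set of White moves and the set of
Black moves played so far. Nothing here is taken from the Shredder 8/Fritz GUI
or its file format; the names position_key, build_book, book_count and
search_count are made up for the example.

```python
from collections import defaultdict

# Toy games. Games 3 and 4 reach the same final position as game 1,
# but by a different move order, so they never pass through the
# position after 1.e4 e5 2.Nf3 Nc6 3.Bc4.
GAMES = [
    ["e4", "e5", "Nf3", "Nc6", "Bc4", "Bc5"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4", "Nf6"],
    ["e4", "e5", "Bc4", "Bc5", "Nf3", "Nc6"],
    ["e4", "e5", "Bc4", "Bc5", "Nf3", "Nc6"],
]


def position_key(moves):
    """Toy stand-in for a position: order-independent sets of moves.

    ASSUMPTION: the set of White moves plus the set of Black moves
    identifies the position. Real books would hash the actual board.
    """
    return (frozenset(moves[0::2]), frozenset(moves[1::2]))


def build_book(games, length):
    """Count, for every position up to `length` plies, the games reaching it."""
    book = defaultdict(int)
    for moves in games:
        for ply in range(1, min(length, len(moves)) + 1):
            book[position_key(moves[:ply])] += 1
    return book


def book_count(book, line, move):
    """Games the position-keyed book reports for `move` played after `line`."""
    return book.get(position_key(line + [move]), 0)


def search_count(games, line, move):
    """Games that pass through the position after `line` and then play `move`."""
    n = len(line)
    target = position_key(line)
    return sum(
        1
        for moves in games
        if len(moves) > n
        and position_key(moves[:n]) == target
        and moves[n] == move
    )


book = build_book(GAMES, length=6)
line = ["e4", "e5", "Nf3", "Nc6", "Bc4"]

print("games at current position:", book.get(position_key(line), 0))
for move in ("Bc5", "Nf6"):
    print(move, "book:", book_count(book, line, move),
          "search:", search_count(GAMES, line, move))
```

In this toy model the current position is reached by 2 games, yet the entry
for one of the moves reports 3 games while a search through the current
position finds only 1. That has the same shape as the 9...d6 discrepancy
described above, but whether the GUI really works this way is not something
this sketch can confirm.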




Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.