Author: Norm Pollock
Date: 20:09:24 12/28/04
On December 28, 2004 at 22:10:45, William Penn wrote:

>On December 28, 2004 at 21:25:59, Stephen A. Boak wrote:
>
>>On December 28, 2004 at 20:54:45, William Penn wrote:
>>
>>>Statistics errors in opening books created with Shredder8/Fritz-GUI
>>>-------------------------------------------------------------------
>>>
>>>I have been creating opening books for quite awhile with the Shredder 8
>>>CB/Fritz-GUI. [I'm talking about the same GUI that comes with Fritz 8, Junior 8,
>>>etc. - not the "Classic" Shredder GUI. Too bad such ambiguities exist, but
>>>hopefully that is clear enough.]
>>>
>>>The procedure (from the chessboard window in the GUI) is:
>>>1) File menu, New, Openings book... (creates a new opening book)
>>>2) Edit menu, Openings Book, Import Games... (imports games from any desired
>>>database)
>>>
>>>The resulting opening book can be displayed in the Book Pane. That is available
>>>by default as one of the tabs in the Notation pane, along with the Notation and
>>>Scoresheet tabs. Or it can be displayed separately by selecting "Extra Book
>>>Pane" from the Window menu, Panes...
>>>
>>>For any given position on the chessboard, the opening book shows the number of
>>>games, percentage scored, average rating, performance, and probabilities -
>>>overall, and for each of the possible moves in the position from the database
>>>involved. A histogram (bar graph) can also be displayed at the bottom of the
>>>Book Pane with the number of white wins, draws, black wins, percentages, etc.
>>>
>>>Much to my dismay, I have just discovered that the data and statistics shown are
>>>inaccurate. There must be a bug somewhere.
>>>
>>>While the total number of games in a given position appears to be correct in all
>>>cases, the other statistics & data for the variations is incorrect. For example
>>>there may actually only be one (1) game in the database for a particular move
>>>from the position, but the Book Pane may say that there are 3 games, or 5 games,
>>>etc. with corresponding incorrect statistics. Search of the database involved
>>>for that position proves that the displayed statistics are incorrect, which is
>>>easy to do when only a few games are involved.
>>>
>>>This error always displays more games than actually exist in the database.
>>>Exactly where the "extra" games come from is a mystery, because they do not
>>>exist in the database. It could be some kind of tree-building error when the
>>>games are imported. Or it could be an error in how the data is displayed (more
>>>of a GUI-based error). I don't know which.
>>>
>>>I also don't know if the error relates to the Length designated when importing
>>>games. That dialog box which appears asks you to select an Absolute length or an
>>>ECO-relative length of 1-100. This determines how big the tree is. The larger
>>>the Length selected, then the bigger (more MB or GB) the resulting opening book.
>>>I usually choose a Length in the range of 15-35, depending on the size of the
>>>database.
>>>
>>>I also don't know if the error relates to the size of the database involved.
>>>Most of my databases are large, the largest over 3,000,000 games. Or perhaps
>>>there is some limit on the size of the resulting book files, but I don't know.
>>>
>>>At this point, I also don't know exactly how bad the error is. If it's only a
>>>handful of games, it wouldn't matter very much when the sample of games in a
>>>given position is large.
>>>
>>>The uncertainties involved makes the resulting statistics practically worthless,
>>>pending further research. Too bad, because it could be quite valuable if
>>>accurate based on the database involved.
>>>
>>>At this point, I'm not aware of any way to avoid this bug. Does anyone know how
>>>it can be avoided?
>>>
>>>My apology for saying "I don't know" so often, but alas there is no detailed
>>>documentation or instructions for these operations insofar as I'm aware. You
>>>just have to discover them on your own, then try to get lucky, I guess. Alas
>>>that is typical of Chessbase documentation. They assume you already know
>>>everything.
>>>WP
>>
>>I have detected similar puzzles in the past.
>>
>>Some thoughts (ideas only):
>>
>>1. Transpositions may mean the position arrived at after making the displayed
>>tree move was reached from different prior positions.
>>
>>Thus a search from the current position may only show one instance.
>>
>>2. Some transpositions may be included only in the notes within a game, but not
>>with a separately saved database game entry.
>>
>>3. Reversed color openings may be included in the statistics, but a search may
>>not find them, even if "equivalent".
>>
>>I suppose a test, creating a small database and corresponding tree, *may* show
>>something, but then the problem may be hard to replicate.
>>
>>Let us know if you learn anything more.
>>
>>Regards,
>>--Steve
>
>Thanks for your thoughts. None of my databases include any annotations, only raw
>game scores, which rules out that possible source of error at least. Here is a
>very simple example of how the Book Pane appears for a particular opening book
>and position, which I happen to be studying at the present time:
>
>The Book Pane says there are a total of 2 games in the underlying database for
>the current chessboard position. That is correct.
>
>It also shows there is 1 game in the database after the move 9...a6 is made.
>That is also correct.
>
>Now the error...
>It says there are 5 games in the database after the move 9...d6 is made. That is
>divided into 2 white wins, 1 draw, and 2 black wins. That is very incorrect! In
>fact, there is only 1 game in the database after 9...d6 (a draw). Where could it
>possibly find the 4 extra games, which do not exist in the database from which
>the opening book was prepared?
>WP

Did you have book learning on? Games played with book learning on will adjust the
stats.
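
The transposition idea raised earlier in the quoted thread can be illustrated with a toy sketch. It is only a model under stated assumptions and says nothing about how the ChessBase GUI actually stores its books: a "position" here is simply the set of moves each side has played, and the names position_key, build_book and count_by_move_order are made up for the illustration. It shows how a book keyed by position can report more games after a move than a search along one exact move order finds.

```python
from collections import Counter

# Toy model (an assumption, not the real book format): a position is
# identified by the set of moves each side has played plus side to move,
# so different move orders using the same moves count as one position.
def position_key(moves):
    white = frozenset(moves[0::2])
    black = frozenset(moves[1::2])
    return (white, black, len(moves) % 2)

def build_book(games, depth):
    """Count, for every position reached within `depth` plies, the games that reach it."""
    book = Counter()
    for moves in games:
        for ply in range(1, min(depth, len(moves)) + 1):
            book[position_key(moves[:ply])] += 1
    return book

def count_by_move_order(games, line):
    """Count games that begin with exactly this move sequence."""
    return sum(1 for moves in games if moves[:len(line)] == line)

# Three hypothetical games that reach the same position by different move orders.
games = [
    ["d4", "Nf6", "c4", "e6"],
    ["c4", "Nf6", "d4", "e6"],
    ["d4", "e6", "c4", "Nf6"],
]
line = ["d4", "Nf6", "c4", "e6"]

by_order = count_by_move_order(games, line)
by_position = build_book(games, depth=4)[position_key(line)]
print(by_order, by_position)
```

In this model the exact move order matches one game, while the position-keyed count is three. Whether transpositions, book learning, or something else explains the 5 games shown after 9...d6 would still have to be checked against the actual database.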
Last modified: Thu, 15 Apr 21 08:11:13 -0700