Computer Chess Club Archives


Subject: Re: Is Mega Database in danger of becoming FatBase?

Author: Mike Hood

Date: 10:10:49 01/13/05


On January 13, 2005 at 10:46:39, Louis Fagliano wrote:

>The number of games each year in ChessBase’s “flagship database” (their term)
>keeps whizzing rapidly upwards:
>
>Mega Database 1999   1.1 million games
>Mega Database 2000   1.4 million games
>Mega Database 2001   1.7 million games
>Mega Database 2002   2.0 million games
>Mega Database 2003   2.3 million games
>Mega Database 2004   2.6 million games
>Mega Database 2005   2.9 million games
>
>It’s just about 300,000 games per year.  Yet if you were to collect all of the
>new games compiled by Mark Crowther in TWIC for one year you would end up with
>about 75,000 to 80,000 new games for that calendar year.  Where are the extra
>games coming from?
>
>To me it doesn’t look like they’re coming from any good sources.
>
>Case in point:  Take the classic beginner’s opening 1. e4 e5 2. Qh5.  Now I
>would expect that in a quality or “flagship database”, there shouldn’t be any
>more than 5 or 6 games with that silly opening by White.
>
>I did a search to find out how many games in Mega Database 2005 started out with
>1. e4 e5 2. Qh5 and was shocked to find out there are 258 games!!  Even worse,
>White actually wins 94 of those games!
>
>Want more?  Well after 1. e4 e5 2. Qh5 there are a flabbergasting 80 games, yes
>count ‘em 80, where Black replies 2... Nf6?? and loses a pawn instantly to 3.
>Qxe5+.
>
>Is Mega Database in danger of becoming FatBase?  At least in the FatBase product
>they are honest enough to tell you that the games include a lot of garbage.
>Just because all the headers and names are consistent doesn’t mean quality if
>you have hundreds of games that start out with 1. e4 e5 2. Qh5.
>
>Even worse, in their search for more games regardless of how awful, they are
>still leaving out some quality games.  In a few opening treatises there is
>occasionally a reference to a game that I cannot find in Mega Database.

Your comments are all valid, Louis, but I have to ask: what would you do better?
If I were putting together a database, I would work along lines like these:

1. Include all the world championship games.

2. Include all the oldest recorded games, good or bad, for historical purposes.

3. Make a list of players (the exclusions could be controversial!) and include
all recorded games by those players.

4. Include all the games played at tournaments where the participants have an
average Elo greater than 2300 (maybe 2400).

5. Exclude all speciality games (such as blindfold games, Shuffle Chess, etc.).

Have I forgotten any other important criteria? A few weak games might slip in,
especially the earliest games by the players listed under 3, but a database
created along these lines should be a useful study tool.
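
For anyone who wants to try these rules on their own game collection, here is a rough Python sketch of the five criteria as a single filter. Everything in it is an assumption made for illustration: the `Game` record and its field names, the `keep_game` function, the cutoff year for "oldest recorded games", and the sample player list are all hypothetical and not taken from any real database format. It also assumes that rule 5 overrides the others, which the list above does not actually settle.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Game:
    """Hypothetical game record; field names are illustrative only."""
    white: str
    black: str
    year: int
    event_avg_elo: Optional[int] = None   # average Elo of the event, if known
    world_championship: bool = False
    speciality: bool = False              # blindfold, Shuffle Chess, etc.


# Assumed values, chosen only to make the sketch runnable.
HISTORICAL_CUTOFF_YEAR = 1850
MIN_EVENT_AVG_ELO = 2300                  # the post suggests 2300, maybe 2400
LISTED_PLAYERS = {"Kasparov, Garry", "Fischer, Robert James"}


def keep_game(
    game: Game,
    listed_players: Set[str] = LISTED_PLAYERS,
    min_avg_elo: int = MIN_EVENT_AVG_ELO,
    cutoff_year: int = HISTORICAL_CUTOFF_YEAR,
) -> bool:
    """Return True if the game passes the five criteria."""
    # Rule 5: speciality games are excluded (assumed to override rules 1-4).
    if game.speciality:
        return False
    # Rule 1: all world championship games.
    if game.world_championship:
        return True
    # Rule 2: the oldest recorded games, good or bad.
    if game.year <= cutoff_year:
        return True
    # Rule 3: every recorded game by a listed player.
    if game.white in listed_players or game.black in listed_players:
        return True
    # Rule 4: events whose average Elo is greater than the threshold.
    return game.event_avg_elo is not None and game.event_avg_elo > min_avg_elo
```

Used as `selected = [g for g in games if keep_game(g)]`, this keeps a listed player's weak early games (as noted above) but drops a game from an unrated event between two unlisted players.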




Last modified: Thu, 15 Apr 21 08:11:13 -0700
