Computer Chess Club Archives


Subject: Re: How to build a ChessBase database from a large PGN file..

Author: Dann Corbit

Date: 21:36:00 07/24/01


On July 24, 2001 at 23:06:58, John Hatcher wrote:

>On July 24, 2001 at 20:57:16, Dann Corbit wrote:
>
>>Not for me, it was asked in this message:
>>http://www.icdchess.com/forums/1/message.shtml?180953
>>
>>Since the header of that message is not descriptive of the actual problem, I
>>thought I would start a new thread so that the OP might find the answer.
>>
>>I'm pretty well ignorant when it comes to CB.
>
>In all seriousness, why would anyone want to build an opening book from 1.5
>million games?

That's a pipsqueak compared to some database files I know of.
I know of one collection with 7.1 million games between rated players.

>Surely, 1.3 million of the games would be between Joe Blow and
>Norm Nobody.  Who cares what they played in the opening?  I would be very
>surprised if all the recorded games between International Masters and
>Grandmasters totaled more than 300,000 games.

Prepare to be surprised.  I have 380K such games in my tiny (highly filtered)
set of 2.5 million games.  I throw out any games with the same move sequence,
so lots of games that are not true duplicates (different players, identical
moves) get clubbed by that filter.
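
A rough sketch of that kind of filter, for anyone who wants to try it.  This is
not the tool I actually use; it assumes plain PGN where tag lines start with
'[' and movetext lines never do, and it ignores variations in parentheses.  The
names split_games, move_key and dedupe_by_moves are made up for illustration.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def split_games(pgn_text):
    """Split PGN text into (header_lines, movetext) pairs.

    Assumption: tag pairs start a line with '[' and movetext lines never do.
    """
    games, headers, moves = [], [], []
    for raw in pgn_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            if moves:  # tags after movetext mean a new game has started
                games.append((headers, " ".join(moves)))
                headers, moves = [], []
            headers.append(line)
        elif line:
            moves.append(line)
    if headers or moves:
        games.append((headers, " ".join(moves)))
    return games

def move_key(movetext):
    """Reduce movetext to bare moves so formatting differences do not matter."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop {comments}
    text = re.sub(r"\$\d+", " ", text)          # drop numeric annotation glyphs
    text = re.sub(r"\d+\.+", " ", text)         # drop move numbers ("1." or "1...")
    tokens = [t.rstrip("!?") for t in text.split() if t not in RESULTS]
    return " ".join(tokens)

def dedupe_by_moves(games):
    """Keep only the first game seen for each distinct move sequence."""
    seen, unique = set(), []
    for headers, movetext in games:
        key = move_key(movetext)
        if key in seen:
            continue  # same moves as an earlier game, even if the players differ
        seen.add(key)
        unique.append((headers, movetext))
    return unique

SAMPLE_PGN = """[White "A"]
[Black "B"]
[Result "1/2-1/2"]

1. e4 e5 2. Nf3 Nc6 1/2-1/2

[White "C"]
[Black "D"]
[Result "1/2-1/2"]

1.e4 e5 {book} 2.Nf3 Nc6 1/2-1/2

[White "E"]
[Black "F"]
[Result "1-0"]

1. d4 d5 2. c4 e6 1-0
"""

unique_games = dedupe_by_moves(split_games(SAMPLE_PGN))
print(len(unique_games), "unique move sequences")
```

Note that the second sample game (C vs. D) is dropped even though the players
differ from the first.  That is exactly the over-filtering described above: the
key looks only at the moves, not at who played them.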

I have about one million games between computer opponents.  Perhaps I want to
include those also.

>I extracted, from a database of 1.5 million games, all the games where both
>players have an actual, or historical (e.g., Capablanca), rating of 2500+
>There are only about 100,000 games in that book.
>
>I wouldn't want a book comprised of 1.5 million games.  There'd be a lot of
>chaff with the wheat.

Well, to each his own.  I wasn't asking for me, but (rather) for someone else.
Anyway, I think it's shortsighted to try to decide what is better for other
people.

Imagine (for instance) that they want to prepare for someone rated Elo 1800 in
their database.  They might notice that they lose 70% of the time to the
French Defense.
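
As a sketch of how such a number could be pulled out of a big, unfiltered
database: the code below tallies, for one named player, the fraction of games
lost per opening.  It assumes the games carry the usual White, Black and Result
tags plus an Opening tag (many collections only fill in ECO, so that part is an
assumption), and the player name and helper names are hypothetical.

```python
import re
from collections import defaultdict

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

def parse_headers(header_lines):
    """Turn PGN tag lines such as [White "Name"] into a dict."""
    tags = {}
    for line in header_lines:
        match = TAG_RE.match(line.strip())
        if match:
            tags[match.group(1)] = match.group(2)
    return tags

def loss_rate_by_opening(games, player):
    """Return {opening: fraction of games lost} for one player.

    `games` is a list of tag dicts; games without the player are skipped.
    """
    totals = defaultdict(int)
    losses = defaultdict(int)
    for tags in games:
        if tags.get("White") == player:
            lost = tags.get("Result") == "0-1"
        elif tags.get("Black") == player:
            lost = tags.get("Result") == "1-0"
        else:
            continue
        opening = tags.get("Opening", "unknown")
        totals[opening] += 1
        losses[opening] += 1 if lost else 0
    return {name: losses[name] / totals[name] for name in totals}

# Hypothetical data: ten games against the French Defense, seven of them lost.
lost_game = {"White": "Pat Example", "Black": "X", "Result": "0-1",
             "Opening": "French Defense"}
won_game = {"White": "Pat Example", "Black": "X", "Result": "1-0",
            "Opening": "French Defense"}
rates = loss_rate_by_opening([lost_game] * 7 + [won_game] * 3, "Pat Example")
print(rates)
```

With only master games in the book, a player like this would likely have too
few games (or none) for the tally to mean anything.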

In my case, I intend to (at some point) analyze every move that has ever been
played.  I estimate there are about one billion distinct positions in that
category.



Last modified: Thu, 15 Apr 21 08:11:13 -0700

Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.