Computer Chess Club Archives


Subject: Re: Clean up PGN

Author: Robert Hyatt

Date: 06:14:51 12/29/99


On December 28, 1999 at 16:25:37, Pete Galati wrote:

>On December 28, 1999 at 15:50:33, Alvaro Rodriguez wrote:
>
>>Hi, I wonder if there is a program that can "clean up" PGN and produce games
>>that are PGN compliant? I used the Chessmaster 6000 database to make a large
>>PGN file, about 300 MB. But when I create the book.bin and books.bin in Crafty,
>>I get error messages like this: ERROR!   Move 20: dxe6ep is illegal (line
>>xxxxxxx). That's why I wonder if such a program exists.
>>
>>Best wishes
>>
>>Alvaro
>
>I used to copy PGN databases into new cbf databases with the Extreme Chess
>program, and then copy them from there back into a PGN database, and that cleaned
>up an awful lot of sloppy PGN files.  You might be able to do something like
>that with ChessBase Lite using cbh files as a transfer format; hard to say,
>it might work.  But you can forget about running a file as large as 300 MB through
>it; I think the limit might be 8000 games before it gets brain freeze.
>
>In the example you give above, I don't think Crafty liked seeing "ep" (a guess)
>because I don't really think that's standard PGN notation.  You can probably use
>a monster-strength editor like Emacs from GNU to replace things like "ep" with
>nothing, and then Crafty might be more willing to parse those moves.
>
>Pete

The common PGN mistakes that many commercial programs make when they write
PGN files are the following:

1.  Using zero-zero for castling rather than oh-oh (the alphabetic character O
is correct, not the number zero).

2.  Adding some nonsensical string like EP, ep, e.p. or enpassant after an
en passant capture.  It is neither needed nor allowed by the PGN/SAN standards.

3.  Promotion is supposed to be d8=Q, not d8Q.

4.  Moves should be in SAN, such as e4, Nf3, Ngf3, Bxg6, etc., not things like
e2-e4 and so forth.

5.  Move numbers are supposed to be followed by a period, followed by a
space.  1. e4 e5 2. Nf3 is correct.  Not 1.e4 e5 etc.

6.  Bogus headers and missing (required) headers.  Bogus headers include
bad dates, missing quote marks, etc.

7.  Bogus comments and variations using {} and ().  They are supposed to be
matched.  Often they are not.
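
As a rough illustration of how mistakes 1, 2, 3 and 5 could be repaired mechanically, and how mistake 7 could at least be detected, here is a minimal Python sketch.  The function names normalize_movetext and delimiters_balanced are made up for this example; they are not part of Crafty or of any PGN library.  It assumes it is handed only the move text of one game (no header lines), and normalize_movetext further assumes the comments have already been stripped.  Mistake 4 (e2-e4 style moves) is not attempted, since converting to SAN properly needs a board to resolve ambiguities.

```python
import re

# Hypothetical helpers for illustration only; not taken from Crafty.

# Mistake 1: digit zeros used for castling.  Long castling is handled
# first so that "0-0-0" is not partly rewritten by the short pattern.
CASTLE_LONG = re.compile(r"\b0-0-0\b")
CASTLE_SHORT = re.compile(r"\b0-0\b")

# Mistake 2: an "ep" style suffix after a pawn capture landing on rank 3 or 6.
# Only the suffix is matched case-insensitively, so piece letters are safe.
EP_SUFFIX = re.compile(r"\b([a-h]x[a-h][36])\s*(?i:en\s?passant|e\.?\s?p\.?)")

# Mistake 3: promotion written without "=", e.g. d8Q or exd8Q.
PROMOTION = re.compile(r"\b([a-h](?:x[a-h])?[18])([QRBN])\b")

# Mistake 5: move number not followed by exactly one space.
# The lookahead leaves the "12..." black-to-move form alone.
MOVE_NUMBER = re.compile(r"\b(\d+)\.(?!\.)\s*")


def normalize_movetext(movetext):
    """Return movetext with mistakes 1, 2, 3 and 5 repaired."""
    text = CASTLE_LONG.sub("O-O-O", movetext)
    text = CASTLE_SHORT.sub("O-O", text)
    text = EP_SUFFIX.sub(r"\1", text)
    text = PROMOTION.sub(r"\1=\2", text)
    text = MOVE_NUMBER.sub(r"\1. ", text)
    return text


def delimiters_balanced(movetext):
    """Return True if {} comments and () variations are properly matched.

    Brace comments do not nest, and parentheses inside a brace comment
    are treated as ordinary text.  Semicolon comments are not handled.
    """
    depth = 0
    in_comment = False
    for ch in movetext:
        if in_comment:
            if ch == "}":
                in_comment = False
        elif ch == "{":
            in_comment = True
        elif ch == "}":
            return False  # closing brace with no open comment
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing paren with no open variation
    return depth == 0 and not in_comment


if __name__ == "__main__":
    print(normalize_movetext("1.e4 e5 2.0-0 0-0-0 3.dxe6ep d8Q"))
    print(delimiters_balanced("1. e4 {unclosed comment e5"))
```

Games that fail delimiters_balanced are probably best written to a separate file for hand inspection rather than guessed at.  For a 300 MB file the same functions could be applied one game at a time while streaming, so the whole file never has to sit in memory.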



Last modified: Thu, 15 Apr 21 08:11:13 -0700
