Computer Chess Club Archives

Subject: Re: PGN Parser available

Author: Mathieu Pagé
Date: 11:29:52 07/12/05
On July 12, 2005 at 13:50:55, Dann Corbit wrote:

>Any chance I can get the formal grammar too?

The grammar is written using the Spirit library from Boost
(http://boost.org/libs/spirit/index.html). This is the first time I have used
it, and also the first time I have used a parser generator tool. I chose it over
lex/byacc and the others because it seemed easier to port, since it only
requires including some header files in the project. I was also confident in
the quality and portability of a library accepted as part of Boost.

The grammar is located in CPGNParser.cpp, in the default constructor of
CPGNParser: CPGNParser::CPGNParser(). I have also copied it at the bottom of
this message.

As I said, this is the first time I have used Spirit, so the code will probably
need some revision to clean it up and optimize it. However, I do not plan to
change the (still limited) public interface. There will probably be some
additions, but the next releases should be backward compatible.

Mathieu Pagé
mathieu.page@gmail.com

<code>
   // match a letter between a and h
   rlColumnChar = range_p('a','h');
   // match a digit between 1 and 8
   rlLineChar = range_p('1','8');

   // match a square id
   rlSquare = rlColumnChar >> rlLineChar;

   // match the character 'x'
   rlTakeChar = ch_p('x');
   // match the character '+'
   rlCheckChar = ch_p('+');
   // match the character '#'
   rlMateChar = ch_p('#');

   // Match the string "O-O" optionally followed by "-O"
   rlCastle = str_p("O-O") >> !str_p("-O");

   // Match the character '=' followed by one of these: 'R', 'N', 'B' or 'Q'
   rlPromotion = ch_p('=') >> (ch_p('R') | ch_p('N') | ch_p('B') | ch_p('Q'));

   // Match one of these characters: 'P', 'R', 'N', 'B', 'Q' or 'K'
   rlPieceChar = ch_p('P') | ch_p('R') | ch_p('N') | ch_p('B') | ch_p('Q') |
                 ch_p('K');

   // Match a pawn move as described in the PGN standard
   rlPawnMove =  !(rlColumnChar >> rlTakeChar) >> rlSquare >> !rlPromotion;
   // Match a non-pawn move as described in the PGN standard
   rlNonPawnMove = rlPieceChar >>
                   ((!rlColumnChar >> !rlLineChar >> !rlTakeChar >> rlSquare) |
                    rlSquare);

   // Match an annotation as defined in the SAN standard
   rlAnnotation = str_p("!!")[bind(&CPGNParser::OnNag, this, 3)] |
                  str_p("!?")[bind(&CPGNParser::OnNag, this, 5)] |
                  str_p("?!")[bind(&CPGNParser::OnNag, this, 6)] |
                  str_p("??")[bind(&CPGNParser::OnNag, this, 4)] |
                  ch_p('!')[bind(&CPGNParser::OnNag, this, 1)] |
                  ch_p('?')[bind(&CPGNParser::OnNag, this, 2)];

   // Match a SAN move
   rlSANMove = ((rlPawnMove | rlNonPawnMove | rlCastle) >>
                !(rlMateChar | rlCheckChar))
               [bind(&CPGNParser::Move, this, _1, _2)];

   // Match a numeric annotation glyph (NAG) as defined by the PGN standard
   rlNAG = ch_p('$') >> uint_p[bind(&CPGNParser::OnNag, this, _1)];

   // Match a tag name as defined in PGN standard
   rlTagName = (+(alnum_p | ch_p('_')));
   // Match a tag value string (string token) as defined in the PGN standard
   rlStringToken = confix_p('\"',
                            (*(anychar_p - ch_p('\t') - ch_p('\n')))
                               [bind(&CPGNParser::TagValue, this, _1, _2)],
                            '\"');

   // Match a tag as defined in the PGN standard
   rlTag = ch_p('[') >> *space_p >>
           rlTagName[bind(&CPGNParser::TagName, this, _1, _2)] >> *space_p >>
           rlStringToken >> *space_p >> ch_p(']');

   // Match a tag section (one or more tags) as defined in the PGN standard
   rlTagSection = (*space_p) >> +(rlTag >> *space_p);

   // Match the game terminator as defined in the PGN standard.
   // Note: the PGN standard also defines "*" (game in progress or result
   // unknown), which this rule does not match.
   rlGameTerminator = (str_p("1-0") | str_p("0-1") | str_p("1/2-1/2"))
                      [bind(&CPGNParser::EndGame, this, _1, _2)];

   // Match a move number (a number followed by a period) as defined in the
   // PGN standard
   rlMoveNumber = uint_p[bind(&CPGNParser::MoveNumber, this, _1)] >> ch_p('.') >>
                  !str_p("..")[bind(&CPGNParser::MoveThreePoints, this)];

   // Match an "end of line" comment or a "between braces" comment.
   rlComment = confix_p(ch_p(';'),
                        (*anychar_p)[bind(&CPGNParser::Comment, this, _1, _2)],
                        eol_p) |
               confix_p(ch_p('{'),
                        (*anychar_p)[bind(&CPGNParser::Comment, this, _1, _2)],
                        ch_p('}'));

   // Match a variation. A variation is a MoveText inserted into another one
   // within parentheses.
   rlVariation = ch_p('(')[bind(&CPGNParser::BeginVariation, this)] >>
                 rlMoveText >>
                 ch_p(')')[bind(&CPGNParser::Endvariation, this)];

   // Match the move text section of a PGN game
   rlMoveText = *space_p >>
                +((rlMoveNumber | (rlSANMove >> !rlAnnotation) | rlComment |
                   rlVariation | rlNAG) >> *space_p);

   // Match a PGN game
   rlPgnGame = rlTagSection >> rlMoveText >> rlGameTerminator >> *space_p;

   // Match a succession of PGN games separated by optional whitespace
   rlPgnCollection = *(*space_p >> rlPgnGame) >> *space_p;
</code>