Author: Andreas Stabel
Date: 02:31:15 10/19/00
On October 18, 2000 at 13:41:49, Richard A. Fowell wrote:

>On October 18, 2000 at 10:51:25, Steve Maughan wrote:
>
>>I'm writing a PGN / EPD parser for my program. I've managed to find the PGN
>>standard on the Internet and implemented most of it.
>>
>>I'm hoping to make my parser as robust as possible so it would be great if it
>>could cope with the situation when other programs produce less than standard PGN
>>output e.g. using 0-0 ('zero'-'zero') instead of O-O. My question is does
>>anyone know of a document that discusses how people have altered the PGN
>>standard?
>>
>>All help appreciated!
>>
>>Steve Maughan
>
>After one programmer complained that the difficulty with adding PGN input to
>his program was that "nobody followed the standard", I surveyed PGN output
>from the Net to try to catalog these deviations. I pulled PGN from every site
>and every program I could find.
>
>I then emailed every offending author I could reach, figuring that:
>a) The best way to clean up PGN on the Net was at the source
>b) The greater the percentage of "good examples" of PGN compliance, the
>   less likely new authors would feel that "anything goes - I don't need
>   to follow the parts of the standard I find irksome".
>
>Sin #1 is the only one that was committed by more than 50% of
>my samples. All others were less than 25% - I think #5 was the next most
>common, though #4 might be significant, too.
>
>Note that this list can be useful both as a list of things to target for
>your parser, and as a list of things to double-check in your output.
>In further aid of that, I've appended the standard PGN test game provided
>by the IECC as an example that exercises many of the rarer and more obscure
>cases PGN must handle. Any PGN compliant program should be able to read this
>game, and (PGN standard section 3.3.1) reproduce it exactly on output.
>
>PGN sins I've seen: (how does your program do on this list?)
> 1) wraps movetext lines to less or more than the required 79 character maximum
> 2) Omits required empty line after movetext (especially at end of file)
> 3) Omits required empty line separating tags from movetext
> 4) Tags not in required order (standard seven in std order, then alphabetic)
> 5) Omits the space after the periods, e.g. "1.e4" instead of "1. e4".
> 6) Inserts non-PGN lines, before/between PGN games in PGN file [TWIC does this]
> 7) Fails to use the "#" to indicate checkmate (uses "+", "++", or no marker)
> 8) Extra spaces added to make move columns line up prettily (ICC used to)
> 9) Uses lower case "o" or zeros instead of upper case "O" in "O-O" and "O-O-O"
>10) Ending token ( such as 1-0) is omitted, or conflicts with Result tag.
>11) Use of "1/2" rather than "1/2-1/2" in Result tag or ending token.
>12) Omits the periods in the movetext
>13) Non-standard promotion indicators (such as b8Q or B8(Q) vs. b8=Q )
>14) Date tag in non-standard format ( 10/31/2000 or 10.31.2000 vs 2000.10.31)
>15) "ep" or "e.p." added after en passant capture
>16) Uses coordinate notation (e2e4 or e2-e4) rather than SAN
>17) Uses long algebraic notation: like Ng1-f3, rather than SAN
>18) Inserts a blank in front of # ( "21. Qa8 #" vs. "21. Qa8#")
>
>Standard PGN test game ("IMCorrect.PGN")
>==================== <standard PGN game follows this line> ======
>[Event "PGN Examples"]
>[Site "IECC"]
>[Date "1997.04.15"]
>[Round "?"]
>[White "Brown, Mary"]
>[Black "Green, John"]
>[Result "1-0"]
>
>1. e4 e5 2. Nf3 Nc6 3. Bc4 b6 4. O-O Bb7 5. d4 Qf6 6. c3 O-O-O 7. Nbd2 exd4 8.
>cxd4 Nge7 9. d5 Ne5 10. Qe2 N7g6 11. Ba6 Bd6 12. Nxe5 Qxe5 13. Bxb7+ Kxb7 14.
>Nf3 Qh5 15. b3 c5 16. dxc6+ dxc6 17. Bb2 Rhe8 18. Rfc1 Bf4 19. Rc4 Rd2 20. Qxd2
>Bxd2 21. Nxd2 Nf4 22. e5 f5 23. exf6 Qg5 24. g3 Ne2+ 25. Kf1 Qb5 26. f7 Kc8 27.
>fxe8=Q+ Kc7 28. Rd4 Nc1+ 29. Kg1 c5 30. Qe3 Nxb3 31. axb3 g5 32. Rda4 c4 33.
>Ra6 Qa5 34. R6xa5 c3 35. Nc4 cxb2 36. Rd1 b1=N 37. Qe4 Nc3 38. Rxa7+ Kb8 39.
>Qb7# 1-0
>
>==================== <standard PGN game precedes this line> =====

This might be a very useful list for PGN implementers in the future, so perhaps
we should all add to it. Here are my additions:

19) Wrong disambiguation, e.g. if two knights could move to the same square but
    one is pinned, disambiguation is not necessary.
20) Shortened moves, like Rc6 instead of Rxc6 or bc instead of bxc6.
21) Illegal characters at the end of a move (usually to indicate whether it was
    good or bad, etc.).
22) Lower-case letters for pieces, causing ambiguity between Bxc6 and bxc6.
    Generally the case of any letter may be wrong.
23) Illegal strings in tags, e.g. the string starts with " but does not end
    with it, the string has a non-escaped " in the middle, the string uses '
    instead, or there is no " at all.
24) Illegal dates in the Date tag. I think this must be one of the most common
    errors.
25) Illegal 'unknown' markers. I must admit that at this point I am a bit
    confused myself. Should it be * or ? or - or an empty string or ??? The
    most common places where this mistake occurs are the Date, Result and
    Round tags.

Finally, I want to say that I find the PGN standard very useful even if it is
not followed strictly. I've been able to write a program that correctly parses
all the PGN files I've encountered, so the deviations are usually not serious.
I agree that parts of the standard are perhaps too strict, but I also think
that almost all of the errors above should be avoided by any serious product.

My pet irritations are wrong results (I mean results where the Result tag
differs from the result in the movetext, or where the result contradicts a
mating last move, a stalemate, etc.). Another one is illegal dates, which make
life difficult when some write August 10, 2000 as 2000.08.10 and others as
2000.10.08, and one can't decide which is right.

Best regards
Andreas Stabel
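
Several of the sins above (5, 9, 11, 13, 15 and 18) are purely textual, so a tolerant reader could probably repair them before the real parsing starts. What follows is only a minimal Python sketch of that idea. The function name normalize_movetext and every regular expression in it are assumptions made for illustration; none of them comes from the PGN standard or from any program mentioned in this thread. It also assumes the input is bare movetext with no tag pairs, comments or variations. Sins such as 16, 17, 19 and 20 need a board and a move generator and are not attempted.

```python
import re


def normalize_movetext(text: str) -> str:
    """Repair a few purely textual PGN deviations in bare movetext.

    Hypothetical helper: assumes no tag pairs, comments or variations.
    """
    # Sin 9: castling written with zeros or lower-case "o".
    text = re.sub(r"\b[0oO]-[0oO]-[0oO]\b", "O-O-O", text)
    text = re.sub(r"\b[0oO]-[0oO]\b", "O-O", text)

    # Sin 15: "ep" / "e.p." appended after a move (must follow a rank digit).
    text = re.sub(r"(?<=[1-8])\s*e\.?p\.?(?=[\s+#]|$)", "", text)

    # Sin 13: pawn promotions written as b8Q or b8(Q) instead of b8=Q.
    text = re.sub(r"\b([a-h](?:x[a-h])?[18])\(?([QRBN])\)?", r"\1=\2", text)

    # Sin 18: blank in front of the checkmate marker.
    text = re.sub(r"\s+#", "#", text)

    # Sin 5: missing space after the move number ("1.e4" or "1...e5").
    text = re.sub(r"(\d+\.(?:\.\.)?)(?=[^\s.])", r"\1 ", text)

    # Sin 11: bare "1/2" used as the draw token.
    text = re.sub(r"(?<![\d/-])1/2(?![\d/-])", "1/2-1/2", text)

    return text


if __name__ == "__main__":
    # The fragment is a collection of sins, not a legal game.
    sample = "5.0-0 Nf6 6. exd6 e.p. 0-0-0 7. b8(Q) Qa8 # 1/2"
    print(normalize_movetext(sample))
    # prints: 5. O-O Nf6 6. exd6 O-O-O 7. b8=Q Qa8# 1/2-1/2
```

The substitutions are kept narrow on purpose so that compliant movetext, such as the IECC test game quoted above, should pass through unchanged. A real parser would more likely fold this tolerance into its tokenizer than rewrite the text up front.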
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.