Computer Chess Club Archives


Subject: Re: Parsing enormous.pgn

Author: Robert Hyatt

Date: 07:00:35 04/08/05



On April 08, 2005 at 00:23:15, Tor Alexander Lattimore wrote:

>Hi
>First, is it all right to use enormous.pgn for a book? Secondly, I've been trying
>to parse it recently, and my program seems to do fine until about 300,000
>games, where it just returns EOF. I've tried opening and reading from other large
>files and get the same problem. Initially I tried using C++'s <iostream>
>library, but when that didn't work I tried standard C fopen() and fgetc() with
>no more success. The file is 900 MB, so it shouldn't run into the strange
>behaviour Windows shows with files of 2 GB or more.
>
>Below is some very cut down code that returns EOF long before the file is
>through.
>/*
>pgn_file is a string holding the path to the pgn file.
>this program will return EOF after about 360,000 games
>*/
>	int games=0;
>	string input;
>	ifstream pgn_in(pgn_file.c_str());
>	while (!pgn_in.eof())
>	{
>		pgn_in >> input;
>		if (input=="[Event") games++;
>	}
>	book_out.close();
>	pgn_in.close();
>	cout << games << "\n";
>	return true;
>
>Anyone have any ideas? The same code as above works just fine in my Gentoo Linux
>system, but not Windows XP home.
>cheers
>Tor

One idea: you need to recognize that some software produces PGN games that are
"broken", for example an opening { with no closing }.  Without a check, the
parser keeps treating everything after that { as comment text.  You can catch
this by picking up on the next PGN header, which will start with [ sometag ... ]

That sounds like what is happening.





Last modified: Thu, 15 Apr 21 08:11:13 -0700
