Author: Dieter Buerssner
Date: 12:05:52 10/12/05
On October 12, 2005 at 14:33:55, Dann Corbit wrote:

>On October 12, 2005 at 12:38:09, Salvo Spitaleri wrote:
>
>>On October 12, 2005 at 12:31:03, Dann Corbit wrote:
>>
>>>On October 12, 2005 at 12:25:41, Salvo Spitaleri wrote:
>>>
>>>>Hello friends,
>>>>
>>>>How to split huge pgn files, in smaller files by name of the players?
>>>>
>>>>Can PGN-extract do it?
>>>
>>>Yes.
>>>
>>>So can SCID and many others too, I imagine.
>>
>>Hi Dann,
>>
>>I mean in an automatic way for all the players in the file.
>>SCID can do it only for one player at a time!
>
>Probably PGN Extract is better.
>
>1. Grep for player names with the "[White " and "[Black " tags
>2. Create a sorted unique list of players from the tags
>3. Use PGN-Extract to filter into groups from the players list.
>
>It goes without saying that the games will have lots of duplicates when filtered
>in this way (e.g. Fischer versus Karpov will show up in the Fischer file and in
>the Karpov file).

Dann, this certainly looks like excellent advice. But it seems to need quite some work between the steps, work that almost needs a programmer.

I did not try, but I guess the grep needs some escape for the "[". The result of the grep will need some trimming (getting rid of the quotation marks, [], White, Black). This seems the most difficult part to me.

I don't know PGN-Extract well enough to judge how well it would work here. I'd fear it would take many passes over the original PGN, so that it could be just too slow on a large PGN (say your "junkbase" with something around 3 GB). It would probably produce a great many files, and not all file systems would be able to handle this. Another complication might be "special" (non-English) letters inside the names (think also of / or \).

No doubt, all these problems are solvable. Perhaps PGN-extract already handles the latter issues well. However, it looks to me as if this were no easy "end user task". I would probably try to use my PGN-parser and write it in C.
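
To make the grep and trimming steps concrete, here is a minimal sketch in Python (used here for brevity, not the C mentioned above). It assumes that every tag pair sits on its own line in the form `[White "Name"]`. The names `TAG_RE`, `player_names` and `safe_filename` are made up for this illustration, and the set of characters allowed in a file name is an assumption that would need adjusting for a particular file system.

```python
import re

# Matches tag lines such as: [White "Fischer, Robert J."]
# The leading "[" is escaped here, just as it would have to be for grep.
TAG_RE = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$')


def player_names(lines):
    """Return the sorted, unique player names found in White/Black tags."""
    names = set()
    for line in lines:
        m = TAG_RE.match(line)
        if m:
            # group(2) is the text between the quotes, so the brackets,
            # the tag name and the quotation marks are already trimmed.
            names.add(m.group(2).strip())
    return sorted(names)


def safe_filename(name):
    """Hypothetical helper: turn a player name into a usable file name.

    Letters (including non-English ones), digits, commas, dots, spaces
    and hyphens are kept; everything else, e.g. / or \\, becomes "_".
    """
    cleaned = re.sub(r'[^\w,. -]', '_', name).strip(' .')
    return (cleaned or 'unknown') + '.pgn'
```

Two different names can still collapse to the same file name under this scheme (for example `A/B` and `A_B`), so a real tool would probably need a collision check.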
I'd try to extract the two names of each game, open two files with trimmed names in append mode and write the game there. Then close the files again. If there is no limit on the number of files, this might work in an hour or so. No idea whether the large number of open/close calls would make the thing too slow to be useful for really large PGNs.

Regards,
Dieter
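
The single-pass, append-mode idea could look roughly like the following sketch, written in Python rather than C. It rests on several assumptions: every game starts with an `[Event ` tag line, tag pairs occupy one line each, and the file can be read as Latin-1. The names `games`, `split_by_player` and `safe_filename` are hypothetical. This is not a full PGN parser, and it has not been timed on a multi-gigabyte file.

```python
import os
import re

TAG_RE = re.compile(r'^\[(White|Black)\s+"(.*)"\]\s*$')


def safe_filename(name):
    """Hypothetical helper: replace characters such as / or \\ with "_"."""
    cleaned = re.sub(r'[^\w,. -]', '_', name).strip(' .')
    return (cleaned or 'unknown') + '.pgn'


def games(lines):
    """Yield each game as a list of lines.

    Assumption: a new game begins at every line starting with "[Event ".
    """
    current = []
    for line in lines:
        if line.startswith('[Event ') and current:
            yield current
            current = []
        current.append(line)
    if current:
        yield current


def split_by_player(pgn_path, out_dir):
    """Read the PGN once and append each game to both players' files."""
    os.makedirs(out_dir, exist_ok=True)
    with open(pgn_path, encoding='latin-1') as src:
        for game in games(src):
            # A set, so a game is written only once if both names match.
            names = {m.group(2).strip()
                     for m in map(TAG_RE.match, game) if m}
            for name in names:
                path = os.path.join(out_dir, safe_filename(name))
                # Open in append mode, write the game, close again.
                with open(path, 'a', encoding='latin-1') as out:
                    out.writelines(game)
```

A call such as `split_by_player("big.pgn", "by_player")` (both names are placeholders) would read the input once and leave one file per player in the output directory. Because the files are opened in append mode, running it twice into the same directory would duplicate every game. Keeping a small cache of open file handles might reduce the open/close overhead, at the cost of running into the operating system's limit on open files.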
Last modified: Thu, 15 Apr 21 08:11:13 -0700