Author: Roger Brown
Date: 11:48:11 12/27/03
On December 27, 2003 at 14:27:18, José Carlos wrote:

>Hi, is there any tool out there that removes duplicate games (ignoring the
>header, I'm only interested in the moves) from a big PGN? I think I recall
>someone saying that SCID can do that, but I can't find any menu option there to
>do it.
> Thanks in advance.
>
> José C.

Hello Jose Carlos,

Indeed, Scid removes duplicates, and the larger the PGN file the better.

(1) Create a new Scid database

It will be empty until you dump the PGN file into it. Scid treats PGN as read only for now. Then follow the instructions in (2) below. There are several other tools out there, but I love Scid...

(2) Deleting twin games

The [File: Maintenance] menu has a command [Delete twin games...] for detecting extra copies (twins) of games in the database. This command finds all pairs of games that are twins and, for each pair, flags the shorter game deleted, leaving the longer game undeleted.

Two games are considered to be twins if their players (and any other tags that you can optionally specify) match exactly. If you specify the "same moves" option, each pair of games must have the same actual moves up to the length of the shorter game (or up to move 60, whichever comes first) to be twins. Since you only care about the moves, make sure that option is selected.

When you have deleted twins, it is a good idea to check that each game deleted really is a copy of another game. You can do this easily if you selected the "Set filter to all deleted games" option in the delete twins dialog box. The filter will now contain all deleted games. You can browse through them (using the p and n keys) with the twins checker window (available from the maintenance menu, or the shortcut key Ctrl+Shift+T) to verify that each game is deleted because it actually is a twin of another game.

(3) Export the cleaned file to PGN

Tools > Export to Pgn

I hope this helps you.

Later.
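For anyone curious how the twin rule described in (2) plays out, here is a rough Python sketch of it. This is not Scid's own code, only an illustration of the rule as worded above. The `Game` class, `are_twins`, `flag_twins`, and `MOVE_LIMIT` are made-up names, moves are assumed to be stored as a list of half-moves, "move 60" is assumed to mean 60 full moves (120 half-moves), and on equal length the later game is arbitrarily the one flagged.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List

MOVE_LIMIT = 60  # assumption: "move 60" means 60 full moves, i.e. 120 half-moves


@dataclass
class Game:
    """Hypothetical stand-in for a database game; not a Scid structure."""
    white: str
    black: str
    moves: List[str]       # half-moves in SAN, e.g. ["e4", "e5", "Nf3"]
    deleted: bool = False  # the "deleted" flag set by the twin search


def are_twins(a: Game, b: Game, same_moves: bool = True) -> bool:
    """Players must match exactly; optionally the moves must match too."""
    if (a.white, a.black) != (b.white, b.black):
        return False
    if not same_moves:
        return True
    # Compare up to the shorter game's length or the move limit,
    # whichever comes first.
    n = min(len(a.moves), len(b.moves), 2 * MOVE_LIMIT)
    return a.moves[:n] == b.moves[:n]


def flag_twins(games: List[Game], same_moves: bool = True) -> List[Game]:
    """Flag the shorter game of every twin pair and return the flagged games."""
    for a, b in combinations(games, 2):
        if are_twins(a, b, same_moves):
            # On equal length the later game (b) is flagged; that tie-break
            # is an assumption of this sketch.
            shorter = a if len(a.moves) < len(b.moves) else b
            shorter.deleted = True
    return [g for g in games if g.deleted]


games = [
    Game("Player A", "Player B", ["e4", "e5", "Nf3"]),
    Game("Player A", "Player B", ["e4", "e5"]),   # truncated copy of the first
    Game("Player A", "Player B", ["d4", "d5"]),   # same players, other game
    Game("Player C", "Player D", ["e4", "e5"]),   # same moves, other players
]
flagged = flag_twins(games)
print([g.deleted for g in games])
```

Note that without the "same moves" check the third game would be flagged as well, since only the player names are compared. That is why it is worth reviewing the deleted games before exporting.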
Last modified: Thu, 15 Apr 21 08:11:13 -0700
Current Computer Chess Club Forums at Talkchess. This site by Sean Mintz.