Computer Chess Club Archives



Subject: Re: A fix for the clone detection problem

Author: Robert Hyatt

Date: 07:55:11 12/01/03



On December 01, 2003 at 01:03:10, Steven Edwards wrote:

>The recent fiasco regarding a suspected clone has shown that the process used,
>an anonymous accusation followed by a coercive source demand, is an unacceptably
>poor method for handling potential source plagiarism.
>
>The clear need here is for a method that does not depend on subjective human
>evaluation of similarity of play or upon the random accusation of a non-biased
>party.  My proposal is instead to use a test suite to provide a performance
>fingerprint of all the entrants in a competition.
>
>This fingerprint is constructed by running the same EPD test suite for each
>program immediately prior to the start of an event and then automatically
>checking the resulting EPD output files with some predetermined similarity
>metric.  The same suite can be fed to non-competing programs if necessary.  The
>similarity test would look at both the PV and the evaluation scores of each
>record generated and this should be enough for clone detection.
>
>The test suite has to be the same for each program, but it does not have to be
>the same suite for each event; neither does it have to be disclosed beforehand.
>It would be best to automatically generate the suite by taking a hundred or so
>high level decisive game scores and selecting a near terminal position from each
>one.  The selected position would be for the winning side a few moves prior to
>the end of the game.
>
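The position-selection step described in the quoted proposal could be sketched roughly as follows. This is only an illustration, not anything from the original posts: the function name `pick_suite_position`, the `moves_before_end` parameter, and the assumption that each game is already available as a list of FEN strings (one per ply) plus a PGN-style result string are all hypothetical.

```python
from typing import List, Optional


def pick_suite_position(fens: List[str], result: str,
                        moves_before_end: int = 3) -> Optional[str]:
    """Pick a near-terminal position with the eventual winner to move.

    fens: hypothetical input, one FEN string per ply of the game.
    result: "1-0" or "0-1"; anything else (draws, unfinished) is skipped.
    """
    if result == "1-0":
        winner = "w"
    elif result == "0-1":
        winner = "b"
    else:
        return None  # only decisive games are used

    # Start roughly `moves_before_end` full moves (two plies each) from the end.
    start = max(len(fens) - 1 - 2 * moves_before_end, 0)

    # Walk forward to the first position where the winning side is to move.
    for fen in fens[start:]:
        if fen.split()[1] == winner:  # second FEN field is the side to move
            return fen
    return None
```

Running this over a hundred or so decisive games and keeping the non-`None` results would give one candidate suite; how many moves before the end to go is a tuning choice the proposal leaves open.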

I don't see how this would solve the problem.  Two good programs _ought_ to
reach the same decisions on a set of problems, with high confidence.  Just
because I happen to match the moves chosen by program X makes me a clone?

I don't think it is that easy.  That might be a good first (coarse)
approximation, however.  But certainly not "the smoking gun".  Not even
an "unloaded gun".



>Advantages:
>
>1. Does not depend on random accusations.
>
>2. Source code is kept private.
>
>3. Equal application for all entrants.
>
>4. No subjectivity, except for deciding the cutoff point for too much
>similarity.
>
>5. Mostly automated process.
>
>6. Done prior to the event, so no surprises during the event.
>
>7. Should discourage cloners from entering an undisclosed clone in the first
>place.
>
>Disadvantages:
>
>1. Requires an hour or so of computing for each program per event.
>
>2. Someone has to write the similarity metric generator.




Last modified: Thu, 15 Apr 21 08:11:13 -0700
