[code-quality] copy/paste detection tool

Lionel Barret lionel.barret at lbdn-consulting.com
Fri Jul 5 07:27:16 CEST 2013


Hi,


My name is Lionel Barret. I attended Florent Xicluna's EuroPython talk on
Tuesday, and it reminded me of a clone detection tool I used in the past (on
a 100k SLOC codebase).

I talked about it with a few people (Florent Xicluna, Joe Gordon) and they
were interested. Florent told me this was the right list for this kind of
discussion.


The tool, named clonedigger (http://clonedigger.sourceforge.net/), detects
copy/pasted code, or the same classes/functions written independently,
across a big codebase. In my last job, I used to get a daily HTML report: a
big overview of everything that had been copy/pasted or rewritten. It was
really useful.
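
To give an idea of what this kind of detection looks like, here is a minimal
sketch. It is not clonedigger's algorithm, only a crude illustration using
the standard `ast` module: each function body is dumped with its variable
names blanked out, and functions with identical dumps are grouped. The names
`function_fingerprint` and `find_clones` are made up for this example. It
can report false positives (for instance `x + y` and `x + x` look the same
once names are blanked) and it misses clones that differ in constants or
attribute names.

```python
import ast
import hashlib
from collections import defaultdict


class _BlankNames(ast.NodeTransformer):
    """Replace every variable name with "_" so renamed copies still match."""

    def visit_Name(self, node):
        node.id = "_"
        return node


def function_fingerprint(func):
    """Hash the structure of a function body, ignoring variable names."""
    blanker = _BlankNames()
    # ast.dump() omits line numbers by default, so position does not matter.
    dumped = "".join(ast.dump(blanker.visit(stmt)) for stmt in func.body)
    return hashlib.sha1(dumped.encode("utf-8")).hexdigest()


def find_clones(sources):
    """sources maps a file name to its source text.

    Returns a list of groups; each group is a sorted list of
    (file name, function name, line number) tuples sharing a fingerprint.
    """
    groups = defaultdict(list)
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        # Collect the function nodes first, then fingerprint them.
        functions = [
            node for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        ]
        for func in functions:
            location = (filename, func.name, func.lineno)
            groups[function_fingerprint(func)].append(location)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

A real tool would also compare sub-blocks, tolerate small differences, and
produce a report, which is where most of the work is.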


Sadly, it is unmaintained: the last upload dates from 2011. Besides, it
uses old packages (like the compiler package) and is likely incompatible
with Python 3 (both for running and for analyzing).

I really think this kind of tool should be part of any code-quality
toolbox, like pyflakes, pep8, etc.

(The tool itself is GPL, so licensing is not a blocker.)


I just wanted to see whether anybody would be interested in an updated
version of the tool, and who could help. Off the top of my head, the next
steps would be contacting the original author, evaluating the work to do
(obsolete modules used and Python 3 incompatibilities), and eventually
refactoring the code.
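
As a rough idea of how the evaluation step could start, here is a small
sketch, to be run under Python 3. It tries to parse a source file and
reports either the syntax error (a Python 3 incompatibility) or any imports
of modules that were removed in Python 3. The function name `audit_source`
and the short `REMOVED_IN_PY3` list are my own assumptions for illustration;
the list is far from complete.

```python
import ast

# Assumption: a small hand-picked sample of modules removed in Python 3.
REMOVED_IN_PY3 = {"compiler", "sets", "commands"}


def audit_source(text, filename="<string>"):
    """Return a dict describing obvious Python 3 porting problems."""
    try:
        tree = ast.parse(text, filename=filename)
    except SyntaxError as exc:
        # Python 2 only syntax (e.g. the print statement) ends up here.
        message = "line %s: %s" % (exc.lineno, exc.msg)
        return {"syntax_error": message, "removed_imports": []}

    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in REMOVED_IN_PY3:
                found.add(top_level)
    return {"syntax_error": None, "removed_imports": sorted(found)}
```

Running this over every file of the tool would give a first, very
approximate list of what needs porting.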

So, what do you think?


Best,

L.