[code-quality] copy/paste detection tool
Lionel Barret
lionel.barret at lbdn-consulting.com
Wed Jul 17 16:59:50 CEST 2013
Hi Sylvain,
Sorry for the late answer; I went straight from EuroPython to a week full of
meetings and then to holidays (almost) off the grid.
I haven't followed Pylint's progress, but two years ago clonedigger was the
clear winner.
But you're right, I need to check whether this other tool is worth pursuing.
I'll test both again in one or two weeks to see how they compare now.
It's only a single data point, but it's a start.
L.
On Fri, Jul 5, 2013 at 10:55 AM, Sylvain Thénault
<sylvain.thenault at logilab.fr> wrote:
> Hello Lionel,
>
> On 05 juillet 07:27, Lionel Barret wrote:
> > My name is Lionel Barret, I attended Florent Xicluna's Europython talk
> > Tuesday and it reminded me of a clone detection tool I used in the past
> > (on a 100k sloc codebase)
> >
> > I talked about it with a few people (Florent Xicluna, Joe Gordon) and
> > they were interested. Florent told me it was the list for this kind of
> > discussion.
> >
> > This tool, named clonedigger (http://clonedigger.sourceforge.net/),
> > detects copy/pasted code or independent writing of the same
> > classes/functions across a big codebase. In my last job, I used to get a
> > daily html report, a big overview of the things that have been
> > copy/pasted/rewritten. It was really useful.
> >
> > Sadly, it is unmaintained, the last upload dates from 2011. Besides, it's
> > using old packages (like the compiler package) and likely incompatible
> > with python3 (either for running or for analyzing).
> >
> > I really think this kind of tool should be part of any code-quality
> > toolbox, like pyflakes, pep8, etc.
> >
> > (The tool itself is GPL, so no blocking there.)
> >
> > I just wanted to see if anybody would be interested in an updated version
> > of the tool and who could help. Off the top of my head, the next steps
> > would be contacting the original author, evaluating the work to do
> > (obsolete modules used and python3 incompatibilities) and eventually
> > refactoring the code.
>
> How does it compare to Pylint's similarity checker? Basically it will
> report copy/pasted/rewritten code involving more than a configurable number
> of lines, after some normalisation.
>
> --
> Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (05.62.17.16.42)
> Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
> Développement logiciel sur mesure: http://www.logilab.fr/services
> CubicWeb, the semantic web framework: http://www.cubicweb.org
>
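
For reference, the general idea Sylvain describes (normalise the lines, then
report runs of at least a configurable number of lines that appear in more
than one place) can be sketched roughly as below. This is only an
illustration of the approach, not Pylint's or clonedigger's actual
implementation; the names normalise, find_duplicates and min_lines are made
up for the example, and the normalisation here is deliberately crude (strip
whitespace, drop blank lines and comment-only lines).

```python
from collections import defaultdict


def normalise(source):
    """Return (line_number, text) pairs with blank and comment lines dropped."""
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if text and not text.startswith("#"):
            kept.append((number, text))
    return kept


def find_duplicates(sources, min_lines=4):
    """Find windows of min_lines normalised lines seen in more than one place.

    sources maps a file name to its source text (hypothetical input shape).
    The result maps each duplicated window, a tuple of normalised lines,
    to a list of (file name, starting line number) pairs.
    """
    seen = defaultdict(list)
    for name, source in sources.items():
        lines = normalise(source)
        # Slide a fixed-size window over the normalised lines of each file.
        for i in range(len(lines) - min_lines + 1):
            window = tuple(text for _, text in lines[i:i + min_lines])
            seen[window].append((name, lines[i][0]))
    return {w: places for w, places in seen.items() if len(places) > 1}
```

Calling find_duplicates({"a.py": text_a, "b.py": text_b}, min_lines=3) would
list every 3-line window shared by the two files. A purely line-based
comparison like this only catches near-verbatim copies; catching independently
rewritten code generally needs a comparison on the syntax tree instead.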
--
Cordialement,
Lionel Barret,
LBdN Consulting
--------------
http://www.lbdn-consulting.com
---
LinkedIn Profile : http://www.linkedin.com/in/lionelbarretdenazaris
Viadeo : http://fr.viadeo.com/fr/profile/lionel.barretdenazaris
---
Membre de l'Arsenal Numérique <http://arsenal-numerique.org/>