[Tutor] Cleaning up output

bjames at Jamesgang.dyndns.org
Wed Jul 3 21:51:57 CEST 2013


I've written my first program. It takes a given directory and looks in all
directories below it for duplicate files. Two files count as duplicates if
they have the same MD5 hash; I know that isn't a perfect test, but it is
good enough for what I'm doing.
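For readers following along, the approach described above can be sketched roughly as below. This is only a minimal illustration, not the code in the linked repository: the names md5_of_file and find_duplicates, the chunk size, and the choice to skip symbolic links are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each MD5 digest to the sorted paths under root that share it.

    Only digests shared by two or more files are returned.
    """
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Assumption: skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_hash[md5_of_file(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}
```

Calling find_duplicates("/some/directory") would return a dictionary whose values are lists of paths with identical content, which is a convenient starting point for any later grouping.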

My problem now is that my output file is a rather confusing jumble of
paths, and I'm not sure of the best way to make it more readable.  My gut
reaction is to list the results by first directory, but is there a logical
way to do it so that all the groups of duplicates whose files sit in the
same two directories end up together?

So I'm thinking I'd have:
First File Dir /some/directory/
Duplicate directories:
some/other/directory/
   original file 1, duplicate file 1
   original file 2, duplicate file 2
some/third/directory/
   original file 3, duplicate file 3

and so forth, where each original file is the name of the file in the First
File Dir, so that all the originals under one heading are in the same place.
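One possible way to get that layout is to key each duplicate on the pair (directory of the original, directory of the duplicate). The sketch below is hedged accordingly: group_by_directory_pair and format_report are hypothetical names, the input is assumed to be lists of paths already known to have identical content, and treating the alphabetically first path in each list as the "original" is an arbitrary choice for the example.

```python
import os
from collections import defaultdict


def group_by_directory_pair(duplicate_groups):
    """Build {first_dir: {dup_dir: [(original_name, duplicate_name), ...]}}.

    duplicate_groups is an iterable of lists of paths with identical content.
    """
    report = defaultdict(lambda: defaultdict(list))
    for paths in duplicate_groups:
        ordered = sorted(paths)
        # Assumption: the alphabetically first path plays the "original".
        first_dir, original_name = os.path.split(ordered[0])
        for duplicate in ordered[1:]:
            dup_dir, dup_name = os.path.split(duplicate)
            report[first_dir][dup_dir].append((original_name, dup_name))
    return report


def format_report(report):
    """Render the nested mapping in the layout proposed above."""
    lines = []
    for first_dir in sorted(report):
        lines.append("First File Dir " + first_dir + "/")
        lines.append("Duplicate directories:")
        for dup_dir in sorted(report[first_dir]):
            lines.append(dup_dir + "/")
            for original_name, dup_name in sorted(report[first_dir][dup_dir]):
                lines.append("   " + original_name + ", " + dup_name)
    return "\n".join(lines)
```

Because the outer key is the original's directory and the inner key is the duplicate's directory, every pair of files that lives in the same two directories lands in the same list, whatever order the files were found in.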

I fear I'm not explaining this well but I'm hoping someone can either ask
questions to help get out of my head what I'm trying to do or can decipher
this enough to help me.

Here's a git repo of my code if it helps:
https://github.com/CyberCowboy/FindDuplicates


