signature for a file ?

Huaiyu Zhu huaiyu at gauss.almadan.ibm.com
Tue Jul 30 18:19:58 EDT 2002


Shagshag13 <shagshag13 at yahoo.fr> wrote:
>
>I have many hard drives at home that may contain many copies of the same
>files, in many places/directories (-> I'm really disorganized).  I would
>like to sort these files out.  To do this I'm planning to write a Python
>script that would compute a kind of CRC32, MD5 or SHA (I'm really not
>competent in that area, so I need advice and pointers to some
>implementations, and to know which is best for getting a unique,
>unambiguous signature for a file) and then use it to find "doubles": same
>size + same signature = probably same file.

That would be very useful indeed.  (Concurs another disorganized person :-)
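
For the file-level step, a minimal sketch might look like the following.  It
assumes a current Python with hashlib; the names file_digest and
find_duplicate_files are made up for illustration, and SHA-256 is just one
reasonable choice (MD5 or SHA-1 would also catch accidental duplicates, while
a 32-bit CRC collides far more easily).  Files are grouped by size first so
that only files sharing a size get hashed.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_files(root):
    """Return lists of paths under root that share both size and digest."""
    # Pass 1: group regular files by size (cheap, no file contents read).
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files whose size is shared with another file.
    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            groups[(size, file_digest(path))].append(path)

    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling find_duplicate_files("/some/dir") returns a list of groups, each a
sorted list of paths that are probably the same file.  A byte-by-byte
comparison of each group would remove the "probably".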

Here's a further question.  Once you get to know the identities of files,
how do you know about the directories?  I have many directories that have
identical subdirectories.  I'd like to build an inventory of maximal
identical directories.  A and B are defined as maximal identical if they are
identical but their parents are not.  The few ideas I have all produce a
combinatorial explosion.
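
One idea that might sidestep the explosion, offered as an untested-in-anger
sketch rather than a known answer: give every directory a signature computed
bottom-up from the names and signatures of its entries, so that two
directories are identical exactly when their signatures match (up to hash
collisions).  Each file is then hashed once and each directory visited once.
The function names below are hypothetical, SHA-256 is an arbitrary choice,
and two assumptions are baked in: identical siblings under one parent count
as maximal, and a whole group is reported if at least one pair in it is
maximal.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def directory_signatures(root):
    """Map every directory under root to a digest of its full contents."""
    root = os.path.abspath(root)
    sigs = {}
    # Bottom-up walk: every child directory is seen before its parent.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        entries = []
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                entries.append(("link", name, os.readlink(path)))
            else:
                entries.append(("file", name, file_digest(path)))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                entries.append(("link", name, os.readlink(path)))
            else:
                # Directories os.walk could not read have no signature.
                entries.append(("dir", name, sigs.get(path, "unreadable")))
        # The directory's own name is left out, so renamed copies still match.
        blob = repr(sorted(entries)).encode("utf-8")
        sigs[dirpath] = hashlib.sha256(blob).hexdigest()
    return sigs


def maximal_identical_dirs(root):
    """Return groups of identical directories not implied by their parents."""
    sigs = directory_signatures(root)
    groups = defaultdict(list)
    for path, sig in sigs.items():
        groups[sig].append(path)

    result = []
    for paths in groups.values():
        if len(paths) < 2:
            continue
        parents = [os.path.dirname(p) for p in paths]
        parent_sigs = {sigs.get(parent) for parent in parents}
        # Skip the group when every member sits in a different parent and all
        # those parents are themselves identical: the parents' group covers it.
        covered = len(parent_sigs) == 1 and len(set(parents)) == len(parents)
        if not covered:
            result.append(sorted(paths))
    return result
```

maximal_identical_dirs("/some/dir") returns a list of groups of directory
paths.  Note that all empty directories share one signature, so they tend to
show up as one large group.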

Huaiyu



More information about the Python-list mailing list