using python to parse md5sum list
jstroud at mbi.ucla.edu
Sun Mar 6 05:19:57 CET 2005
Among many other things:
First, you might want to look at os.walk() (os.path.walk() in older Python 2 code; it was removed in Python 3).
Second, look at the string data type.
Third, get the Python Essential Reference.
Also, Programming Python (O'Reilly) actually has a lot in it about stuff like
this. It's a tedious read, but in the end it will help a lot with administrative
tasks like the one you are doing here.
So, with the understanding that you will look at these references, I will
foolishly save you a little time...
If you are using md5sum, you can grab the md5 and the filename like this:
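md5sums = []
with open(filename) as myfile:
    for aline in myfile:
        # assumes each line looks like "<md5>  <path>"; split on the first run of whitespace
        md5, path = aline.rstrip("\r\n").split(None, 1)
        md5sums.append((md5, path))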
The md5 sum will be element 0 of each tuple in the md5sums list, and
the path to the file will be element 1.
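
From there, finding the duplicates is mostly a matter of grouping the paths by md5 sum. Here is a minimal sketch, assuming md5sums is a list of (md5, path) tuples as described above. The function names find_duplicates and write_report and the report filename are made up for illustration, so adjust them to taste:

```python
from collections import defaultdict


def find_duplicates(md5sums):
    """Group paths by md5 sum and keep only the sums seen more than once."""
    paths_by_md5 = defaultdict(list)
    for md5, path in md5sums:
        paths_by_md5[md5].append(path)
    # Drop every md5 that maps to a single path; those files are unique.
    return {md5: paths for md5, paths in paths_by_md5.items() if len(paths) > 1}


def write_report(duplicates, report_name):
    """Write one block per duplicated md5 sum to a plain text file.

    report_name is a hypothetical output path chosen by the caller.
    """
    with open(report_name, "w") as report:
        for md5, paths in sorted(duplicates.items()):
            report.write("%s appears %d times:\n" % (md5, len(paths)))
            for path in paths:
                report.write("    %s\n" % path)
```

You would call it as write_report(find_duplicates(md5sums), "duplicates.txt"). A dictionary keyed on the md5 sum keeps this to a single pass over the list, which should be plenty fast for the lists from four machines concatenated together.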
On Saturday 05 March 2005 07:54 pm, Ben Rf wrote:
> I'm new to programming and I'd like to write a program that will parse
> a list produced by md5summer and give me a report in a text file on
> which md5 sums appear more than once and where they are located.
> The end goal is to have a way of finding duplicate files that are
> scattered across a LAN of 4 Windows computers.
> I've dabbled with different languages over the years and I think
> Python is a good language for this but I have had a lot of trouble
> sifting through manuals and tutorials finding out which commands I need
> and their syntax.
> Can someone please help me?
James Stroud, Ph.D.
UCLA-DOE Institute for Genomics and Proteomics
Los Angeles, CA 90095