Why checksum? [was Re: Fuzzy Lookups]
Steven D'Aprano
steve at REMOVETHIScyber.com.au
Tue Jan 31 16:28:17 EST 2006
On Tue, 31 Jan 2006 10:51:44 -0500, Gregory Piñero wrote:
> http://www.blendedtechnologies.com/removing-duplicate-mp3s-with-python-a-naive-yet-fuzzy-approach/60
>
> If anyone would be kind enough to improve it I'd love to have these
> features but I'm swamped this week!
>
> - MD5 checking to find exact matches regardless of name
> - Put each set of duplicates in its own subfolder.
This isn't a criticism; it is a genuine question. Why do people compare
local files by MD5 checksum instead of doing a byte-for-byte comparison?
Is it purely a caching thing (once you have the checksum, you don't need
to read the file again)? Are there any other reasons?
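
To make the two approaches concrete, here is a minimal sketch of what I
mean. The helper names (md5_of, same_bytes, group_by_md5) are made up for
illustration and are not taken from Gregory's script; the 64 KB chunk size
is an arbitrary assumption.

```python
import filecmp
import hashlib


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """Byte-for-byte comparison of two files.

    shallow=False makes filecmp compare the contents rather than
    trusting matching os.stat() signatures.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)


def group_by_md5(paths):
    """Group paths by digest; keep only groups with more than one file.

    Hypothetical helper: each file is read once, and the digest is what
    gets compared afterwards.
    """
    groups = {}
    for path in paths:
        groups.setdefault(md5_of(path), []).append(path)
    return [group for group in groups.values() if len(group) > 1]
```

With same_bytes, both files are read each time a pair is compared. With
group_by_md5, each file is read once and only the digests are compared
afterwards, which is the caching effect I mentioned above.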
--
Steven.