CRC-module
Michael Hudson
mwh21 at cam.ac.uk
Wed Nov 24 10:40:31 EST 1999
Thomas Weholt <thomas at bibsyst.no> writes:
> Hi,
>
> Ok, so I`ve looked into zlib.crc32 and zlib.adler32. They seem easy
> enough to use, but I thought crc-codes had characters and numbers in
> them, not just a plain integer like the methods above return. ( As you
> can see, I`m a complete ass on this subject, but don`t have time to do
> the proper research myself, and was hoping for a "quick fix" ... )
Well, on one level there's not much difference between a binary string
and an integer. But crc32 returns a 32-bit value, so it's most
convenient/efficient to store it in an integer.
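To illustrate the point, here is a minimal sketch showing the same CRC-32 as an integer, as hex text (the "characters and numbers" form), and as four raw bytes. The helper name crc32_forms is made up for this example, and the code assumes a current Python 3 with the standard zlib module.

```python
import zlib

def crc32_forms(data):
    """Return the CRC-32 of data as (integer, 8-digit hex string, 4 raw bytes)."""
    # Mask to 32 bits so the value is unsigned on every Python version.
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return crc, format(crc, "08x"), crc.to_bytes(4, "big")

# b"123456789" is the conventional CRC check input; its CRC-32 is cbf43926.
as_int, as_hex, as_bytes = crc32_forms(b"123456789")
print(as_int, as_hex, as_bytes)
```

All three forms carry the same 32 bits; the hex string is just a display format.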
> A friend of mine mentioned that I should try SHA-1 instead, for more
> accuracy. Can anybody give me an example on how to compute crc-codes,
> using zlib or preferrably some more accurate method, for single files ??
Well, comparing crc32 and SHA-1 or md5 isn't really comparing like
with like, to the (small) extent of my knowledge on the matter; crc32
(AFAIK) is designed to spot accidental transmission errors, sha-1/md5
are (certainly) designed to spot malicious modification.
Also, md5 digests are 128 bits and sha-1 digests are 160, so two different
files are far less likely to share a digest by chance than they are to
share a 32-bit crc32.
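The digest sizes can be checked directly. This is only a sketch, assuming a current Python 3 where the standard hashlib module provides both algorithms; digest_bits is a made-up helper name.

```python
import hashlib

def digest_bits(name):
    """Return the digest length, in bits, of the named hashlib algorithm."""
    return hashlib.new(name).digest_size * 8

for name in ("md5", "sha1"):
    print(name, digest_bits(name), "bits")
```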
> If this is all it takes :
>
> crc = module_name.crc_method(file)
>
> and comparison is done like :
>
> if (crc1 == crc2): print "Equal."
> else : print "Different."
>
> then all I need is the name of the most effective/accurate module to
> use.
>
> If, for some strange reason, I should use one module instead of another,
> that info would be interesting too.
For that application, it'd probably be best to use sha, e.g.:
import sha
# open in binary mode so the bytes are hashed exactly as stored on disk
sha.sha(open(filename, 'rb').read()).digest()
Efficiency-wise that'll be IO bound, so the fact that SHA-1 is more
CPU-intensive (I think) than crc32 shouldn't be relevant.
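For readers on a current Python: the sha module was removed in Python 3, and hashlib is its standard-library replacement. Below is a rough sketch of the file comparison asked about, under that assumption. The names file_sha1 and same_content and the 64 KiB chunk size are choices made for this example, not anything from the thread.

```python
import hashlib

def file_sha1(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    # Binary mode: hash the bytes exactly as they are stored on disk.
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """Return True if both files have the same SHA-1 digest."""
    return file_sha1(path_a) == file_sha1(path_b)
```

Reading in chunks avoids loading a large file into memory in one go, and hexdigest() gives a printable string that is easy to store or compare.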
HTH,
Michael