
Patch #935454 is a module implementing SHA-256, a variant of the 160-bit SHA algorithm supported in Python's existing sha module. Though it's more recent than the original SHA-160, SHA-256 is just as "standard"; both algorithms are specified by a NIST document. The submitter comments:

    The difference is that it produces a 256 bit hash value, instead of a 160 bit hash value. SHA-256 thus has 128 bits of resistance against birthday attacks, which makes it secure in certain protocols where SHA-1 is questionable (e.g. digital signatures; or RNGs or Key-Derivation Functions where you want to produce keys for 256-bit ciphers).

A quick skim over the code doesn't turn up any issues, and the patch includes a test suite but no documentation. I don't want to do a detailed code review or require docs from the submitter if the module isn't likely to be included, so do we want to add this module?

There are a bunch of other variants with different bit sizes such as 512, 384, and 224 bits. The only one likely to matter is SHA-512, so adding sha256 might mean that down the road we need to add a sha512 module, too, but that seems unlikely.

--amk
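The birthday-bound arithmetic in the submitter's comment can be checked with a short sketch. It assumes Python's later `hashlib` module, which did not exist when this thread was written, and a hypothetical helper named `birthday_bits`. The rule of thumb applied is that a generic collision search on an n-bit hash costs roughly 2^(n/2) work; algorithm-specific attacks are ignored.

```python
import hashlib

def birthday_bits(name):
    """Return (output bits, approximate collision-resistance bits) for a hash.

    Hypothetical helper: applies only the generic birthday bound n/2.
    """
    output_bits = hashlib.new(name).digest_size * 8
    return output_bits, output_bits // 2

for name in ("sha1", "sha256"):
    out, resist = birthday_bits(name)
    print(f"{name}: {out}-bit output, about {resist} bits against birthday attacks")
```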

On Tue, Jun 29, 2004 at 02:40:21PM -0400, Barry Warsaw wrote:
I'd sure hope so. Personally I prefer not to use a bits argument, as it makes passing a function reference to a hash algorithm needlessly require a lambda. Instead do as someone else suggested and make a sha.sha256() function:

    do_thing_and_hash(things=thingList, hashmaker=lambda x: sha.new(x, bits=256))

vs

    do_thing_and_hash(things=thingList, hashmaker=sha.sha256)

The only official SHAs defined are sha1, sha256, sha384 and sha512, and they are typically referred to by a single unit name such as "sha1" or "sha512", not "the SHA which is 512 bits." Using a simple function name that matches the common spoken name is nicer.

Perl uses a top-level module to contain all its hash functions (Digest::MD5, Digest::SHA1, etc). I agree that we should do the same (as someone else already suggested).

Realistically, let's not reinvent the wheel. See the pycrypto module: http://www.amk.ca/python/code/crypto.html

MD5 and SHA1 are the most common types of hashes in use today; those make sense to have in the base python distro. Does sha256 or greater? If someone needs something better than sha1 there is a good chance that they are also dealing with symmetric encryption or public key authentication and would need a module like pycrypto anyways.

-g
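The lambda-versus-named-function point can be made runnable as a minimal sketch. It assumes the later `hashlib` module in place of the `sha` module discussed here, and `do_thing_and_hash` is the hypothetical consumer from the message above, given an illustrative body.

```python
import hashlib

def do_thing_and_hash(things, hashmaker):
    """Hypothetical consumer: hash each item with whatever factory is passed in."""
    return [hashmaker(item).hexdigest() for item in things]

thing_list = [b"spam", b"eggs"]

# Parameterized style: the algorithm is an argument, so a lambda is needed
# to turn the constructor into a one-argument factory.
via_lambda = do_thing_and_hash(thing_list, lambda data: hashlib.new("sha256", data))

# Named-constructor style: the function itself is already the factory.
via_name = do_thing_and_hash(thing_list, hashlib.sha256)

print(via_lambda == via_name)
```

Both calls produce the same digests; the difference is only in how much wrapping the caller has to write.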

At 12:04 PM 6/29/2004 -0700, Gregory P. Smith wrote:
[...] The only official SHAs defined are sha1, sha256, sha384 and sha512
Plus SHA-224, which is just SHA-256 with a different initial value and truncated output (similarly, SHA-384 is just SHA-512 with a different initial value and truncated output). (Why so many SHAs? Due to birthday attacks, sometimes hash algorithms have half the bit-security of their output length, and you might want your hash's security level to match your cipher's security level. AES has a security level (i.e. key length) of 128, 192, or 256 bits. 3DES has a level of 112 bits. SHA-1 was designed as part of a suite including the Skipjack 80-bit cipher).
Agreed, since SHA-1, SHA-256, and SHA-512 are different algorithms that are just named similarly. That's less true of SHA-224 and SHA-384, which *are* just parameterizations of the other algorithms, and could be done like:

    sha256.new('string', bits=224)
    sha512.new('string', bits=384)
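A small check of the "different initial value and truncated output" relationship, as a sketch only. It assumes the later `hashlib` module, which exposes the truncated variants as separate constructors rather than through the `bits=` keyword proposed above.

```python
import hashlib

message = b"abc"
full_256 = hashlib.sha256(message).digest()
short_224 = hashlib.sha224(message).digest()
full_512 = hashlib.sha512(message).digest()
short_384 = hashlib.sha384(message).digest()

# Output sizes in bits for each pair.
print(len(full_256) * 8, len(short_224) * 8)
print(len(full_512) * 8, len(short_384) * 8)

# Because the initial value differs, the shorter digest is not simply
# a prefix of the longer one.
print(short_224 == full_256[:len(short_224)])
print(short_384 == full_512[:len(short_384)])
```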
Yeah, that also gives room to grow if other hashes become prevalent.
I think SHA-256 does, since SHA-1 is skimpy for a lot of uses. There aren't many SHA-256-using protocols due to inertia, but I think it's happening. The others I don't see any rush for.
Good point. Probably this and other crypto patches need to be viewed in light of a broader "crypto strategy", whatever that may be. My thought is that since almost all crypto protocols depend on a tiny number of primitives (a few ciphers, a few hashes, modular exponentiation, random numbers), it would be good to have these in stdlib. Otherwise crypto-using apps require extensions (like pycrypto + GMP), which makes them hard to distribute. It would be great to borrow code from pycrypto where possible (for example, pycrypto has excellent ciphers, though it doesn't have SHA-256). But these things would be handiest if they came with the standard library. Anyways, I advocated this below. I'd be happy to write this up as a PEP or something, if that would be easier to consider than a scattershot set of patches? http://mail.python.org/pipermail/python-dev/2004-May/044673.html Trevor

Trevor Perrin <trevp@trevp.net> writes:
I think SHA-256 does, since SHA-1 is skimpy for a lot of uses.
Nevertheless, am I right to still believe that there are no known distinct strings which even MD5 to the same hash?
Unfortunately, distributing crypto software is still a hideous international mess (just because the *US* is less silly these days...). Cheers, mwh -- GAG: I think this is perfectly normal behaviour for a Vogon. ... VOGON: That is exactly what you always say. GAG: Well, I think that is probably perfectly normal behaviour for a psychiatrist. -- The Hitch-Hikers Guide to the Galaxy, Episode 9

On Wed, Jun 30, 2004 at 10:15:24AM +0100, Michael Hudson wrote:
Nevertheless, am I right to still believe that there are no known distinct strings which even MD5 to the same hash?
Correct. One significant reason for the larger SHAs is to generate 256-bit keys for AES encryption; it's better to have a larger hash than to take a smaller one and replicate portions of it. But, given that we're not going to include AES in the Python stdlib, people will have to download a separate library anyway. This library could include SHA256, so this application isn't a compelling reason to add SHA256 to the stdlib. It would be different if there were existing protocols that need the larger hash, such as HTTP digest auth; are there any? --amk
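The key-size argument can be illustrated with a sketch that assumes the later `hashlib` module. The passphrase is a made-up input, and hashing a passphrase once is shown only to compare output lengths, not as a recommended key-derivation scheme.

```python
import hashlib

passphrase = b"correct horse battery staple"  # illustrative input only

# A 256-bit hash fills a 32-byte AES-256 key directly.
key_from_sha256 = hashlib.sha256(passphrase).digest()

# A 160-bit hash yields only 20 bytes; stretching it by repeating its own
# bytes reaches 32 bytes but adds no new unpredictability.
sha1_digest = hashlib.sha1(passphrase).digest()
key_from_sha1 = (sha1_digest + sha1_digest)[:32]

print(len(key_from_sha256), len(key_from_sha1))
# The last 12 bytes of the stretched key are a copy of its first 12 bytes.
print(key_from_sha1[20:] == sha1_digest[:12])
```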

On 30/06/2004, at 5:30 PM, A.M. Kuchling wrote:
People use sha as a one way hash for storing passwords and credit card details. The extra bits will make the paranoid a bit happier (and the less paranoid are still using md5). -- Stuart Bishop <stuart@stuartbishop.net> http://www.stuartbishop.net/

At 10:15 AM 6/30/2004 +0100, Michael Hudson wrote:
Yes, though there's a distributed computing project looking for a collision, and they expect to succeed in a couple years. http://www.md5crk.com/
Things have been liberalizing rapidly. I'm not sure how true that is anymore, though I don't have direct experience (aside from offering some crypto software on a website; people download it from all over the place, but maybe they're scofflaws, who knows). I know US export is no problem. According to [1], most countries have no laws restricting imports, with the notable exception of ex-USSR countries and China, which require licenses. I've heard anecdotally the Russian requirements are mostly ignored [2]. I don't know about China. More anecdotal evidence: The Windows Python installer includes strong crypto (SSL). Has that caused problems? Regardless, we could offer a no-crypto distribution. It would be interesting to see how many people download it. If not many, then it could be abandoned....
A.M. Kuchling wrote:
There are protocols that can use SHA-256, like SSH, S/MIME, or PGP, but these all require other crypto primitives, so your point stands. And I agree: crypto primitives should probably be considered as a lump. If ciphers are absolutely not going to get in, putting in other crypto stuff is not that helpful. Trevor
[1] http://rechten.uvt.nl/koops/cryptolaw/cls-sum.htm
[2] http://www.privacy.nb.ca/cryptography/archives/cryptography/html/1997-03/002...

Agreed. Python already includes crypto and US export is nothing more than a harmless "let US Dept of Whateveritscalledtoday know that X has crypto in it." The bsddb module includes encrypted database support (unless the Windows packager has been building the non-crypto version of the library distributed by Sleepycat; I haven't checked). The point about SSL being included is interesting. The OpenSSL library provides implementations of all of the important hash algorithms (and uses them in order to implement SSL!). Its hashing code is much better optimized on various architectures than the Python module ever will be. I just filed feature request 983069 to keep this on the radar.
To waffle on my earlier question of "what uses sha256 w/o also needing crypto?"... One reason I can see for adding sha-256 and sha-512 (and 224/384 wrappers) to standard python is that they will potentially be used in future distributed data storage and p2p protocols for large data set integrity checking. -g
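A minimal sketch of the integrity-checking use case mentioned above, assuming the later `hashlib` module. The helper name `sha256_of_stream` and the 64 KiB chunk size are illustrative choices, not taken from the thread.

```python
import hashlib
import io

def sha256_of_stream(stream, chunk_size=64 * 1024):
    """Hash a binary stream incrementally so the whole data set never sits in memory."""
    digest = hashlib.sha256()
    # iter() with a sentinel keeps calling read() until it returns b"" at EOF.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

data = b"x" * 200_000
checksum = sha256_of_stream(io.BytesIO(data))

# The incremental digest matches hashing the same bytes in one call.
print(checksum == hashlib.sha256(data).hexdigest())
```

For a file on disk, the same helper could be given a file object opened in binary mode.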

[Gregory P. Smith]
The Windows installer does ship the non-crypto version of Sleepycat Berkeley DB. There was no debate about that, it was just the easiest thing to do at the time. For the same "reason", the Windows installer doesn't ship any of the auxiliary BDB tools either ... it's the laziest packaging of bsddb that allowed the test suite to pass. Improving this story would probably require a volunteer who actually knows something about BDB.

At 03:52 PM 6/30/2004 -0700, Gregory P. Smith wrote: [...]
On my P4, OpenSSL SHA-1 looks around 25% faster (75 vs. 60 MB/s). FWIW, I've changed the patch to support SHA224, 384, and 512. There are "sha256" and "sha512" modules, with an extra function in each module for the truncated algorithm:
http://sourceforge.net/tracker/index.php?func=detail&aid=935454&group_id=5470&atid=305470 Since there are some module-level functions and constants (new(), digestsize, blocksize), I like using separate modules instead of sticking everything in 'sha'. We could also add some simple wrapper modules for sha224 and sha384 to make them appear as top-level modules, like the other ones. Trevor
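Throughput figures like the MB/s numbers quoted above can be estimated with a rough sketch such as the following. It assumes the later `hashlib` module, the helper `throughput_mb_per_s` is hypothetical, and a single timed run like this is noisy, so the results are only indicative for the machine it runs on.

```python
import hashlib
import time

def throughput_mb_per_s(name, total_mb=16):
    """Rough single-run estimate of hashing speed in MB/s (hypothetical helper)."""
    block = b"\0" * (1024 * 1024)  # one 1 MiB block, fed repeatedly
    digest = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(total_mb):
        digest.update(block)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed

for name in ("sha1", "sha256", "sha512"):
    print(f"{name}: {throughput_mb_per_s(name):.0f} MB/s")
```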

Trevor Perrin <trevp@trevp.net> writes:
Interesting, thanks. The concept of Python being 'imported' or 'exported' still strikes me as meaningless, but I'm not going to worry about it. Cheers, mwh -- Many of the posts you see on Usenet are actually from moths. You can tell which posters they are by their attraction to the flames. -- Internet Oracularity #1279-06

At 10:24 AM 6/29/2004 -0400, A.M. Kuchling wrote:
Sorry about the lack of docs. They'll be easy to copy-and-modify from the "sha" module, I'll try to get to that in a day or two.
I agree that SHA-512 is less important: it's much slower, and the security margin vs. SHA-256 is excessive. However, a "hashes" package might make sense:

    from hashes import md5
    from hashes import sha
    from hashes import sha256
    ...

At some future date, you could imagine a "ciphers" package with similar structure:

    from ciphers import AES
    from ciphers import DES3
    ...

Trevor
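The shape of the proposed "hashes" package can be mimicked in a few lines, purely as a sketch. The `hashes` namespace here is hypothetical and built on the later `hashlib` module; in a real package each name would be a submodule rather than an attribute.

```python
import hashlib
from types import SimpleNamespace

# Hypothetical "hashes" namespace: each attribute is bound to a hashlib
# constructor, standing in for the submodules proposed above.
hashes = SimpleNamespace(
    md5=hashlib.md5,
    sha=hashlib.sha1,
    sha256=hashlib.sha256,
)

print(hashes.sha256(b"abc").hexdigest())
print(hashes.sha(b"abc").digest_size * 8, "bit digest from hashes.sha")
```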

"A.M. Kuchling" <amk@amk.ca> writes:
Patch #935454 is a module implementing SHA-256, a variant of the 160-bit SHA algorithm supported in Python's existing sha module.
Why a new module and not just a new function 'sha.sha256'? Cheers, mwh -- MAN: How can I tell that the past isn't a fiction designed to account for the discrepancy between my immediate physical sensations and my state of mind? -- The Hitch-Hikers Guide to the Galaxy, Episode 12
participants (9): A.M. Kuchling, Armin Rigo, Barry Warsaw, Gregory P. Smith, Michael Hudson, Raymond Hettinger, Stuart Bishop, Tim Peters, Trevor Perrin