Encryption source code with md5

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Apr 4 04:38:26 CEST 2010

On Sun, 04 Apr 2010 13:21:34 +1200, Lawrence D'Oliveiro wrote:

> In message <4baf3ac4$0$22903$e4fe514c at news.xs4all.nl>, Irmen de Jong
> wrote:
>> On 28-3-2010 12:08, Lawrence D'Oliveiro wrote:
>>> Don’t use MD5.
>> Also, md5 is not an encryption algorithm at all, it is a secure hashing
>> function.
> You can use hash functions for encryption.

The purpose of encryption is for the holder of the secret key to be able 
to reverse the encryption easily and reliably, while nobody else can. 
Hash functions fail on three counts.

Firstly, there is no secret key to a hash function, so if you can reverse 
it, so can anyone. That alone rules it out as encryption.

Secondly, hash functions are generally difficult to reverse. For 
cryptographic hash functions, ideally they should be impossible to 
reverse short of trying every possible input.
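
As a rough sketch of what "trying every possible input" means, here is a 
brute-force search over a deliberately tiny input space. The function name 
brute_force_md5 and the restriction to lowercase ASCII strings of at most 
three letters are my own assumptions for illustration; the search only 
finishes quickly because that space is so small.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def brute_force_md5(target_hex, max_len=3):
    """Try every lowercase string up to max_len letters (hypothetical helper).

    Returns the first candidate whose MD5 hex digest equals target_hex,
    or None if nothing in the search space matches.
    """
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            candidate = "".join(letters).encode("ascii")
            if hashlib.md5(candidate).hexdigest() == target_hex:
                return candidate
    return None


# No key is involved anywhere: anyone who knows the digest can run this.
target = hashlib.md5(b"cat").hexdigest()
recovered = brute_force_md5(target)
print(recovered)
```

Each extra letter multiplies the work by 26, so the same approach is 
hopeless for inputs of any realistic length.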

Thirdly, even when reversible, hash functions have collisions. 
Consequently, you can't be sure whether you have found the intended 
message, or merely some random string which happens to hash to the same 
value.
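
Collisions are easy to see if the digest is made artificially short. The 
sketch below keeps only the first two bytes of MD5 (my own weakening, named 
tiny_hash, purely for illustration) and hashes the strings "0", "1", "2", 
... until two of them land on the same value. With only 65536 possible 
values, a collision must turn up within 65537 tries.

```python
import hashlib


def tiny_hash(data):
    """First 2 bytes (16 bits) of MD5: a deliberately weak stand-in."""
    return hashlib.md5(data).digest()[:2]


def find_collision():
    """Hash "0", "1", "2", ... until two inputs share a tiny_hash value."""
    seen = {}
    i = 0
    while True:
        msg = str(i).encode("ascii")
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
        i += 1


a, b = find_collision()
print(a, b, tiny_hash(a).hex())
```

Given only the two-byte value, nothing tells you which of the two inputs 
was the intended one. Full MD5 has the same property, just with a much 
larger output space.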

Admittedly, if you found a message that *made sense*, you could make a 
probabilistic argument that it probably was the original message. The 
shorter the message, the more confident you could be that you had found 
the right one: there is probably only one short, grammatically correct, 
semantically meaningful English sentence of fewer than ten words that has 
an MD5 hex digest of 22008290c5d1ff0bd5fae9e425b01d41, so if you find one, 
it probably will be "Meet at railway station at 3pm".

On the other hand, there are a very large number of (say) 20GB data files 
that hash to 22008290c5d1ff0bd5fae9e425b01d41, and probably no practical 
way of distinguishing the true message from the false collisions. Even if 
you can distinguish them, since the cost of reversing the hash is 
prohibitive, every false positive hurts you a lot.
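
To put a rough number on "a very large number", here is a back-of-envelope 
count. It assumes a file of exactly 20 * 10**9 bytes and assumes that MD5 
spreads inputs evenly over its 2**128 possible digests; both are 
simplifications of mine, not measured facts.

```python
import math

file_bits = 20 * 10**9 * 8   # assumed: "20GB" means 20 * 10**9 bytes
digest_bits = 128            # MD5 produces a 128-bit digest

# There are 2**file_bits distinct files of that size and 2**digest_bits
# digests, so on average 2**(file_bits - digest_bits) files share each one.
exponent = file_bits - digest_bits

# The count itself is far too large to print, so report its length instead.
decimal_digits = int(exponent * math.log10(2)) + 1
print("about 2**%d files per digest" % exponent)
print("a number with roughly %d decimal digits" % decimal_digits)
```

Under those assumptions the 128 bits of the digest barely dent the 
exponent, which is why the digest tells you almost nothing about which 
large file was hashed.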

Of course, none of this is to prohibit using a hash function as a 
component of a larger encryption scheme.
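
As a toy illustration of that last point, the sketch below uses a hash 
function to generate a keystream from a secret key and XORs it with the 
message. The names keystream and xor_crypt, the key and nonce values, and 
the choice of SHA-256 rather than MD5 are all my own assumptions. This is 
not a vetted cipher and provides no integrity protection; it only shows 
where the secret key enters, which is what the bare hash lacks.

```python
import hashlib


def keystream(key, nonce, length):
    """Hash key + nonce + counter repeatedly to get `length` pad bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def xor_crypt(key, nonce, data):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    pad = keystream(key, nonce, len(data))
    return bytes(d ^ p for d, p in zip(data, pad))


key = b"hypothetical shared secret"   # illustrative value only
nonce = b"unique-per-message"         # must never repeat under the same key
plaintext = b"Meet at railway station at 3pm"

ciphertext = xor_crypt(key, nonce, plaintext)
roundtrip = xor_crypt(key, nonce, ciphertext)
print(roundtrip)
```

Here the holder of the key reverses the operation easily and reliably, 
which is the property the bare hash function cannot offer. For real use, 
rely on a reviewed cryptographic library rather than a home-made 
construction like this one.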

