ziade.tarek at gmail.com
Fri Jul 3 14:57:19 CEST 2009
OK, here's my proposal for the checksum:
- I'll add the "hash_type:" suffix in the RECORD file
- install will get a new option to define which hash should be used
  when writing the RECORD file;
  it will default to SHA1 for 2.7/3.2
- pkgutil, which reads the RECORD files, will pick the right hash
  function by looking at the suffix
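
A minimal sketch of how the write and read sides could fit together, using only hashlib. The helper names (record_checksum, verify_checksum) are hypothetical and not part of install or pkgutil, and the layout "hash_type:hexdigest" is an assumption: the thread calls the tag both a prefix and a suffix, so the exact position in the RECORD line is a guess.

```python
import hashlib

# Proposed default for 2.7/3.2, per the list above.
DEFAULT_HASH = "sha1"


def record_checksum(data, hash_type=DEFAULT_HASH):
    """Return a tagged checksum such as 'sha1:<hexdigest>'.

    Hypothetical helper; the tag layout is an assumption.
    `data` must be bytes.
    """
    digest = hashlib.new(hash_type, data).hexdigest()
    return "%s:%s" % (hash_type, digest)


def verify_checksum(data, tagged):
    """Recompute the digest with the hash named in the tag and compare."""
    hash_type, _, expected = tagged.partition(":")
    return hashlib.new(hash_type, data).hexdigest() == expected


if __name__ == "__main__":
    content = b"print('hello')\n"
    entry = record_checksum(content, "sha256")
    print(entry)
    print(verify_checksum(content, entry))
```

Because the reader takes the hash name from the entry itself, changing the default later would not break RECORD files written with an older hash.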
Now, security is another story that goes beyond the scope of this PEP.
Notice, though, that the PEP provides a placeholder for distribution metadata,
so it could host a key later on.
On Thu, Jul 2, 2009 at 8:00 PM, Charles Yeomans <charles at declaresub.com> wrote:
> On Jul 2, 2009, at 1:37 PM, Lie Ryan wrote:
>> Joachim Strömbergson wrote:
>>> Tarek Ziadé wrote:
>>>> The prefix is a good idea but since it's just a checksum to control
>>>> that the file hasn't changed
>>>> what's wrong with using a weak hash algorithm like md5 or now sha1 ?
>>> Because it creates a dependency to an old algorithm that should be
>>> deprecated. Also using MD5, even for a thing like this might make people
>>> believe that it is an ok algorithm to use - "Hey, it is used by the
>>> default install in Python, so it must be ok, right?"
>>> If we flip the argument around: Why would you want to use MD5 instead of
>>> SHA-256? For the specific use case the performance will not (should not)
>>> be an issue.
>>> As I wrote a few mails ago, it is time to move forward from MD5 and
>>> designing something in 2009 that will be around for many years that uses
>>> MD5 is (IMHO) a bad design decision.
>>>> If someone wants to modify a file of a distribution he can recreate
>>>> the checksum as well,
>>>> the only secured way to prevent that would be to use gpg keys but
>>>> isn't that overkill for what we need ?
>>> Actually, adding this type of security would IMHO be a good idea.
>> Now, are we actually talking about security or checksum?
>> It has been known for years that MD5 is weak, weak, weak. Not just in
>> the recent years. But it doesn't matter since MD5 was never designed for
>> security, MD5 was designed to protect against random bits corruption. If
>> you want security, look at least to GPG. For data protection against
>> intentional, malicious forging, definitely MD5 is the wrong choice. But
>> when you just want a simple checksum to ensure that a faulty router
>> somewhere in the internet backbone doesn't destroy your data, MD5 is
>> a fine algorithm.
> On the contrary, MD5 was intended to be a cryptographic hash function, not a
> Charles Yeomans
Tarek Ziadé | http://ziade.org