[Distutils] a plea for backward-compatibility / smooth transitions (was: Re: Migrating Hashes from MD5 to SHA256)

Donald Stufft donald at stufft.io
Mon Jul 29 16:30:18 CEST 2013


On Jul 29, 2013, at 7:58 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> 
>> Actually, I strongly object to further backward-incompatible changes.
>> 
>> Please (generally) find a way to introduce improvements without breaking
>> existing installation processes at the same time.
>> 
>> For example, in this case pip/easy_install could indicate to PYPI what
>> kind of hashes it accepts (through a header or query param or whatever)
>> and PyPI could serve it but we'd default to MD5 for now if nothing else
>> was requested.  Please also consider the PEP438 vetted registration of
>> externals+hashes in this context.  Once things and tools are working
>> nicely we can switch to serving a non-MD5 hash as default after a
>> sufficient grace period.
> 
> Having the improved hashes be opt-in (by the client) strikes me as a
> reasonable request.
> 
> Yes, this means nothing will actually happen until easy_install/pip
> are updated to request those improved hashes and those versions see
> significant uptake, but as Holger says, we need to ensure we put
> sufficient effort into smoothing out the roller coaster ride that has
> been the recent experience of packaging system users.

There's basically zero way for this to fail closed in any of the current installers. The
failure mode is unverified packages, not uninstallable packages. I am not aware of
a single installer that mandates the use of a hash. Crate.io has never used MD5
hashes and has always used sha256, and I've never received a single report of
an installer being unable to install because of it, which is exactly what I expect.
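
To make the fail-open behaviour concrete, here is a minimal sketch of how an
installer might treat the hash on a download link. It is an illustration only,
not code from pip or easy_install: the function name `verify_link`, the
`KNOWN_ALGORITHMS` set, and the assumption that the hash is carried as a
`#<algo>=<hexdigest>` URL fragment are all hypothetical.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical: the only algorithm this imaginary older installer understands.
KNOWN_ALGORITHMS = {"md5"}


def verify_link(url, content, known=KNOWN_ALGORITHMS):
    """Return "verified" or "unverified"; raise ValueError on a mismatch."""
    fragment = urlsplit(url).fragment
    algo, sep, expected = fragment.partition("=")
    if not sep or algo not in known:
        # No hash at all, or a hash type the client does not recognise:
        # the download is still installed, just without verification.
        return "unverified"
    actual = hashlib.new(algo, content).hexdigest()
    if actual != expected:
        raise ValueError("hash mismatch for %s" % url)
    return "verified"
```

Under these assumptions, a client that only knows MD5 and is handed a
`#sha256=...` link returns "unverified" rather than refusing to install.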

Indicating via header or query param pretty much destroys the effectiveness of the
CDN's cache in order to fix a problem with a theoretical (as far as I am aware)
installer that requires an MD5 hash (and thus has never worked for any of the externally
hosted packages). Additionally, it doesn't account for external URLs, which need to be
registered *with* a hash.
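
As a rough illustration of the caching concern, the toy model below keys cached
responses on the URL plus the value of one request header. It is only a sketch:
`TinyCache` and the `X-Accept-Hash` header are hypothetical names, and a real
CDN is far more involved.

```python
class TinyCache:
    """Toy cache keyed on the URL plus one (optional) varying request header."""

    def __init__(self, vary_on=None):
        self.vary_on = vary_on  # header name the response varies on, if any
        self.store = {}
        self.origin_hits = 0    # how often we had to go back to the origin

    def get(self, url, headers):
        variant = headers.get(self.vary_on) if self.vary_on else None
        key = (url, variant)
        if key not in self.store:
            self.origin_hits += 1
            self.store[key] = "index page for {!r}".format(key)
        return self.store[key]


# Hypothetical clients asking for the same page with different hash preferences.
requests = [
    {"X-Accept-Hash": "md5"},
    {"X-Accept-Hash": "sha256"},
    {},
    {"X-Accept-Hash": "sha256,md5"},
]

plain = TinyCache()
varied = TinyCache(vary_on="X-Accept-Hash")
for headers in requests:
    plain.get("/simple/example/", headers)
    varied.get("/simple/example/", headers)
```

In this model the header-agnostic cache goes to the origin once, while the
cache that varies on the header goes to the origin once per distinct header
value.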

As far as available attacks go, *today* an author could upload a package that has been
crafted to have a sister package with malicious code and the same hash,
allowing them to substitute the malicious package at will without the hashes
changing at all. In the future it's possible that a pre-image attack on MD5 will be found,
and we'll be dealing with this problem then, when we've lost all verification on
external URLs, instead of now, when we have time to get external URLs to switch.

So by all means, I will not migrate us if that's what you want. Old versions of the
installation clients stick around far too long for the opt-in mechanism to be much
use. The point of switching was to cover the existing clients as well, to narrow
the gap until a new API is developed. Hopefully no one is relying on these
hashes to prevent an author from maliciously injecting a sister package, and
hopefully the strength of MD5 holds and no new research is found that blows
its pre-image attack resistance to pieces.

As far as not breaking things goes, backwards compatibility has been an important
concern; however, progress forward *requires* breakage. It is required because there
is a vast array of available ways to have your package and/or hosting configured,
many of them horrible practices which need to be killed. Killing them requires breaking
backwards compatibility. You cite SSL. Yes, SSL has caused a number of errors for
people, mostly related to older versions of OpenSSL being unable to use an SSL
certificate, but downloading code you're going to execute over plaintext isn't just
bad, it's downright negligent on the part of the toolchain. So that was a required breakage.

You also mention pip 1.4 *not* installing pre-releases by default. Yes, that
broke a handful of packages, Supervisor and pytz being the major ones that I've
seen anyone complain about. It was also known ahead of time that this was a
backwards-incompatible change (and it was noted as such in the release notes). It
wasn't a surprising outcome. The pip developers "drew a line in the sand," to quote
Paul Moore, and I expect pip 1.5, where PEP438 becomes default, to break even more
packages from people who just haven't bothered to change their practices until
it's forced on them.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
