On Nov 12, 2014, at 2:34 PM, Alex Gaynor email@example.com wrote:
Right now, PyPI provides MD5 hashes for packages, which are used by pip to check for corruption in transit. I'd like to propose we replace MD5 with SHA256 for PyPI, and move to deprecate MD5 support in pip and setuptools.
Why should we do this? MD5 is broken. Collision resistance is totally 100% uselessly busted, and pre-image resistance is mathematically broken; practical attacks aren't known publicly, but it's reasonable to assume private attacks are strong because (sing it with me): "Attacks only get better".
So MD5 doesn't provide the guarantees one might expect; SHA256 is not broken in these ways. But it's not just failing to provide value, it's actively causing problems: some machines, such as those with packages compiled to meet FIPS-140-2, do not have MD5 available at all, and so pip's verification raises an exception.
While one might be inclined to find a way to silently support both machine configurations, I'd like to instead say we should abhor any additional configuration (whether user supplied or auto-detected) and instead simply upgrade the hashes offered by PyPI, and begin the deprecation process for MD5 in pip.
There are currently 60 packages on PyPI which are *not* hosted on PyPI, but do have MD5 hashes there. For these packages we could download the package, verify the MD5 hash, and then upgrade what PyPI stores to be SHA256.
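A minimal sketch of that migration step, assuming the package bytes have already been downloaded. The function name `upgrade_digest` and its parameters are hypothetical, not part of PyPI or pip; it only illustrates "verify the MD5, then record the SHA256":

```python
import hashlib


def upgrade_digest(data: bytes, stored_md5: str) -> str:
    """Check data against the stored MD5, then return its SHA256 hex digest.

    Hypothetical helper for illustration. Note that hashlib.md5 may be
    unavailable on FIPS-restricted builds, so this would need to run on a
    machine where MD5 still works.
    """
    actual_md5 = hashlib.md5(data).hexdigest()
    if actual_md5 != stored_md5.strip().lower():
        # The download does not match what PyPI recorded; do not upgrade.
        raise ValueError("MD5 mismatch: refusing to record a SHA256 digest")
    return hashlib.sha256(data).hexdigest()
```

The SHA256 value is only as trustworthy as the MD5 check and the download that produced the bytes, so this is a one-time bootstrap rather than an ongoing verification scheme.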
+1 from me.
Security-wise, pip supported sha256 before it supported TLS, so for pip, anything that can securely fetch the sums from the /simple/ pages can also use sha256. For setuptools there was a small window between implementing TLS verification (0.7) and implementing support for hashes other than MD5 (0.9). However, I don't think this small window represents a large (or any?) number of users.
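For illustration, here is a rough sketch of how a client could check a download against the digest carried in a /simple/ link, assuming the link ends in a fragment of the form `#sha256=<hexdigest>`. The function name `check_link`, the `ALLOWED` set, and the example URL are assumptions made for this sketch; this is not pip's or setuptools' actual code:

```python
import hashlib
from urllib.parse import urlsplit

# Assumption for this sketch: only accept SHA-2 digests, never MD5.
ALLOWED = {"sha256", "sha384", "sha512"}


def check_link(url: str, data: bytes) -> bool:
    """Return True if data matches the digest named in the URL fragment."""
    fragment = urlsplit(url).fragment
    name, sep, expected = fragment.partition("=")
    if not sep or name not in ALLOWED:
        raise ValueError("no acceptable hash in link fragment: %r" % fragment)
    return hashlib.new(name, data).hexdigest() == expected.lower()
```

A link such as `https://example.invalid/pkg-1.0.tar.gz#sha256=<hexdigest>` would pass only if the downloaded bytes hash to that value, while a link carrying only `#md5=...` would be rejected outright under this sketch's policy.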
IOW the impact should be non-existent other than having better digests.
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA