Benjamin Bach and Hanno Böck are running https://www.pytosquatting.org/ and registered many projects like https://pypi.python.org/pypi/urllib2

"In June 2016, Typosquatting programming language package managers stated that urllib2 had ~4,000 downloads in 2 weeks. The package name is now squatted by us (the good guys). We take these findings seriously."

It seems like we need a solution to prevent a project that was removed because it contained malicious code from being recreated automatically.

pytosquatting.org projects contain a download file: a tarball with a setup.py file. This setup.py raises an exception, but also sends an HTTP request, a "pingback", to their server.

Thank you for reserving names of the standard library. But I'm not sure about the HTTP "pingback" part. The package can be installed on CIs, in restricted environments, etc. Why not just reserve the name without providing any download file? With no download file, the user will likely understand their error, no?

Note: I don't think that Benjamin Bach and Hanno Böck are related to the PSRT or to the PyPI administrators.

Victor

2017-09-15 22:28 GMT+02:00 Victor Stinner <victor.stinner@gmail.com>:
Hi,
Last week, the National Security Authority of Slovakia contacted the Python Security Response Team (PSRT) to report that the Python Package Index (PyPI) was hosting malicious packages. Installing these packages sent user data to an HTTP server, but also installed the expected module, so it was not easy to notice the attack.
Advisory: http://www.nbu.gov.sk/skcsirt-sa-20170909-pypi/
Kudos to them for reporting the issue!
It's not a compromise of the PyPI server nor of a third-party project, but the "typo squatting" issue, which has been known since at least June 2016 (for PyPI). The issue is not specific to Python: npmjs.com and rubygems.org are vulnerable to the same issue.
For example, malicious packages used the names "urllib" (no 3) and "urlib3" (1 L) instead of "urllib3" (2 L). These packages were downloaded by users, so the attack was effective.
More information on typo squatting and Python package security: https://python-security.readthedocs.io/packages.html#pypi-typo-squatting
The PSRT contacted the PyPI administrators and all identified packages were taken down, only 1h10 after the PSRT received the email from the National Security Authority of Slovakia!
The typo squatting issue is known and discussed, but no solution has been found yet. See for example this warehouse issue: https://github.com/pypa/warehouse/issues/2151
It seems like the consensus is that pip is not responsible for detecting malicious code; it's more the responsibility of PyPI.
The problem is to decide how to detect malicious code and/or prevent typo squatting on PyPI.
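As a rough illustration of the "prevent typo squatting" half of the problem, here is a minimal sketch of a name-similarity check that could run when a new project name is registered. It is not something PyPI or pip is known to implement: the POPULAR_NAMES list, the similar_names() helper and the 0.8 cutoff are all hypothetical, and the comparison simply relies on difflib from the standard library.

```python
import difflib

# Hypothetical list of well-known names to protect. A real list would have
# to come from somewhere else, e.g. download statistics or the list of
# standard library modules.
POPULAR_NAMES = ["urllib3", "requests", "setuptools", "numpy"]


def similar_names(candidate, known=POPULAR_NAMES, cutoff=0.8):
    """Return known names that the candidate resembles without being equal.

    An empty list means the candidate is either an exact known name or
    is not close to any known name (according to the assumed cutoff).
    """
    candidate = candidate.lower()
    if candidate in known:
        return []
    # get_close_matches() ranks names by SequenceMatcher similarity ratio
    # and keeps only those with a ratio >= cutoff.
    return difflib.get_close_matches(candidate, known, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for name in ("urllib", "urlib3", "urllib3", "flask"):
        print(name, "->", similar_names(name))
```

Such a check would only flag names for review; it cannot tell whether the code inside a package is malicious, and a fixed cutoff would also flag legitimate projects with similar names.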
The issue has been discussed privately on the PSRT list last week. The National Security Authority of Slovakia just published their advisory, and a public discussion started on Hacker News: https://news.ycombinator.com/item?id=15256121
I consider that it's now time to find a solution on the public python-dev mailing list.
Let's try to find a solution!
Can we learn something from The Update Framework (TUF)?
How do JavaScript, Ruby, Perl and other programming languages deal with these security issues in their package managers?
See also my other notes on Python security and the list of known CPython vulnerabilities: https://python-security.readthedocs.io/
Victor