[Distutils] Malicious packages on PyPI
prometheus235 at gmail.com
Fri Jun 2 00:04:12 EDT 2017
I suggested on one of those issues that we try auto-blacklisting names that
commonly 404, since that should pose a negligible usability hit. I'd like to
start by logging them to collect data, but I'm not sure whether that should
go into pypa/warehouse or pypa/pypi-legacy. How long until Warehouse is where
most requests go? Or do some go there already, and if so, from which
clients? Any pointers would be appreciated.
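
To make the logging idea concrete, here is a rough sketch of how 404s on
``/simple/`` could be counted once the data exists. I am assuming
Common Log Format style access lines, which may not match what PyPI actually
records; the function names and the ``min_hits`` threshold are made up for
illustration.

```python
import re
from collections import Counter

# Assumption: access lines look like Common Log Format. PyPI's real logs may differ.
REQUEST_RE = re.compile(
    r'"GET /simple/(?P<name>[^/\s]+)/? HTTP/[\d.]+" (?P<status>\d{3}) '
)


def normalize(name):
    # PEP 503 normalization: runs of "-", "_" and "." become one "-", lowercased.
    return re.sub(r"[-_.]+", "-", name).lower()


def count_missing(lines):
    """Count 404 responses per normalized project name under /simple/."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "404":
            counts[normalize(match.group("name"))] += 1
    return counts


def blacklist_candidates(lines, min_hits=100):
    """Names requested at least min_hits times that do not exist (hypothetical cutoff)."""
    return {name for name, hits in count_missing(lines).items() if hits >= min_hits}
```

Anything above the threshold would only be a candidate; someone would still
want to review the list before reserving names.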
On Thu, Jun 1, 2017 at 6:29 PM, Donald Stufft <donald at stufft.io> wrote:
> On Jun 1, 2017, at 6:20 PM, Jannis Gebauer <ja.geb at me.com> wrote:
> This makes me remember
> https://hackernoon.com/building-a-botnet-on-pypi-be1ad280b8d6
> on a related note.
> Yep, that’s basically the same thing. Instead of using package names of
> builtins, the attacker is using a combination of popular apt/yum packages
> with a mix of package names with typos.
> During development, it’s not uncommon to make mistakes like:
>     pip install requirements.txt (forgot the -r)
>     pip install requestd (typo)
>     pip install tkinter (not registered)
> Or to use the wrong package manager (apt-get install python-dev vs. pip
> install python-dev).
> I wonder if it would make sense to build some kind of blacklist for this.
> According to the blog post there were close to 10k installs over a period
> of just three days. I believe Debian is running some kind of popularity
> contest for their packages which could be used to identify problematic
> packages. This will be a lot of manual work, but I’d work on a list like
> Folks have suggested mining the logs from PyPI looking for 404 results on
> ``/simple/`` to highlight common packages that are being installed which
> don’t yet exist, then using that data to inform a sort of automatic
> blacklist for new project names.
> Other folks have suggested that trying to use some sort of algorithm with
> existing names to figure out common typos is a solution.
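
One way the typo idea might look, as a sketch only: flag a proposed name when
it sits within a small edit distance of an existing project name. Levenshtein
distance is my own choice here, not something specified in the thread, and
``likely_typo_of`` and ``max_distance`` are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, using a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def likely_typo_of(new_name, existing_names, max_distance=1):
    """Existing names that new_name could plausibly be a typo of."""
    return sorted(
        name for name in existing_names
        if name != new_name and edit_distance(new_name, name) <= max_distance
    )
```

For example, ``likely_typo_of("requestd", {"requests", "django"})`` would flag
``requests``. Comparing against every existing name is quadratic in practice,
so a real check would probably restrict the comparison to popular projects.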
> Ultimately the thing that’s missing is someone to spend the time to figure
> out a good solution and implement it. I will get to it eventually, but if
> someone feels enthused to make it happen sooner, then their contribution
> would be appreciated.
> Donald Stufft