[Distutils] Don't Use `sudo pip install` (was Re: [final version?] PEP 513…)

Tue Feb 16 21:22:29 EST 2016

> On Feb 16, 2016, at 6:12 PM, Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
> 
>> 
>> On Feb 16, 2016, at 5:00 PM, Noah Kantrowitz <noah at coderanger.net> wrote:
>> 
>> 
>>> On Feb 16, 2016, at 4:46 PM, Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
>>> 
>>>> 
>>>> On Feb 16, 2016, at 4:33 PM, Noah Kantrowitz <noah at coderanger.net> wrote:
>>>> 
>>>> 
>>>>> On Feb 16, 2016, at 4:27 PM, Glyph Lefkowitz <glyph at twistedmatrix.com> wrote:
>>>>> 
>>>>> 
>>>>>> On Feb 16, 2016, at 4:13 PM, Noah Kantrowitz <noah at coderanger.net> wrote:
>>>>>> 
>>>>>> As someone that handles the tooling side, I don't care how it works as long as there is an override for tooling a la Chef/Puppet. For stuff like Supervisord, it is usually the least broken path to install the code globally.
>>>>> 
>>>>> I don't know if this is the right venue for this discussion, but I do think it would be super valuable to hash this out for good.
>>>>> 
>>>>> Why does supervisord need to be installed in the global Python environment?
>>>> 
>>>> Where else would it go? I wouldn't want to assume virtualenv is installed unless absolutely needed.
>>> 
>>> This I can understand, but: in this case, it is needed ;).
>>> 
>>>> Virtualenv is a project-centric view of the world which breaks down for stuff that is actually global like system command line tools.
>>> 
>>> [citation needed].  In what way does it "break down"?  https://pypi.python.org/pypi/pipsi is a nice proof-of-concept that dedicated virtualenvs are a better model for tooling than a big-ball-of-mud integrated system environment that may have multiple conflicting requirements.  Unfortunately it doesn't directly address this use-case because it assumes that it is doing per-user installations and not a system-global one, but the same principle holds: which version of `ipaddress` supervisord wants to use is irrelevant to the tools that came with your operating system, and similarly irrelevant to your application.
>>> 
>>> To be clear, what I'm proposing here is not "shove supervisord into a venv with the rest of your application", but rather, "each application should have its own venv".  In supervisord's case, "python" is an implementation detail, and therefore the public interface is /usr/bin/supervisord and /usr/bin/supervisorctl, not 'import supervisord'; those should just be symlinks into /usr/lib/supervisord/environment/bin/
>> 
>> That isn't a thing that exists currently, I would have to make it myself and I wouldn't expect users to assume that is how I made it work. Given the various flavors of user expectations and standards that exist for deploying Python code, global does the least harm right now.
> 
> I don't think users who install supervisord necessarily think they ought to be able to import supervisord.  If they do expect that, they should probably revise their expectations.
> 
> Here, I'll make it for you.  Assuming virtualenv is installed:
> 
> python -m virtualenv /usr/lib/supervisord/environment
> /usr/lib/supervisord/environment/bin/pip install supervisor
> ln -vs /usr/lib/supervisord/environment/bin/supervisor* /usr/bin
> 
> More tooling around this idiom would of course be nifty, but this is really all it takes.
> 
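For anyone who would rather script the three commands quoted above, here is a minimal sketch of the same idiom in Python. It is an illustration, not anything the thread proposes verbatim: it swaps in the standard-library `venv` module (Python 3) for `python -m virtualenv`, assumes a POSIX `bin/` layout, and the function names `create_tool_env` and `link_scripts` are made up for this example.

```python
import glob
import os
import subprocess
import venv


def create_tool_env(env_dir, package):
    """Create a dedicated environment at env_dir and install one package into it."""
    # Assumption: stdlib venv stands in for the `python -m virtualenv` step.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    pip = os.path.join(env_dir, "bin", "pip")  # POSIX layout assumed
    subprocess.check_call([pip, "install", package])


def link_scripts(env_dir, pattern, bin_dir):
    """Symlink scripts matching pattern from env_dir/bin into bin_dir."""
    linked = []
    for script in sorted(glob.glob(os.path.join(env_dir, "bin", pattern))):
        link = os.path.join(bin_dir, os.path.basename(script))
        os.symlink(script, link)  # like `ln -s`: raises if the link already exists
        linked.append(link)
    return linked
```

Run with enough privilege to write to both directories, the equivalent of the quoted recipe would be roughly `create_tool_env("/usr/lib/supervisord/environment", "supervisor")` followed by `link_scripts("/usr/lib/supervisord/environment", "supervisor*", "/usr/bin")`; the project is published on PyPI under the name `supervisor`.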
>>> In fact, given that it is security-sensitive code that runs as root, it is extra important to isolate supervisord from your system environment for defense in depth, so that, for example, if, due to a bug, it can be coerced into importing an arbitrarily-named module, it has a restricted set and won't just load anything off the system.
>> 
>> Sounds cute, but the threats it actually helps with seem really minor. If a user can install stuff as root, they can probably do whatever they want thanks to .pth files and other terrible things.
> 
> Once malicious code is installed in a root-executable location it's game over; I didn't mean to imply otherwise.  I'm saying that since supervisord might potentially import anything in its site-packages dir, this is just less code for you to worry about that might have security bugs in it.
> 
> One specific example of how you might do this is by specifying a protocol-defined codec; if you ever do .decode(user_data) on a string you're doing an attacker-controlled dynamic import.  This is a bug, of course, but a harmless one if you have a small set of modules with no surprises lurking in store.  But, if the attacker can 'import qt' (whose default behavior was to abort() if it couldn't open $DISPLAY for many years, not sure if it still is) from the system, or anything like that, you have potential crashes on your hands.
> 
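The codec point quoted above can be observed directly. The sketch below is only an illustration of the mechanism, with a made-up helper name (`modules_imported_by_decode`): it records which modules appear in `sys.modules` while a decode call runs. As far as I know, current CPython confines this lookup to the `encodings` package, so on a modern interpreter it shows that the encoding name drives an import, not that an arbitrary top-level module can be loaded.

```python
import sys


def modules_imported_by_decode(data, encoding):
    """Return the names of modules newly imported while decoding data."""
    before = set(sys.modules)
    # The codec lookup for an unfamiliar name may import encodings.<name>.
    data.decode(encoding)
    return sorted(set(sys.modules) - before)
```

In a fresh interpreter, calling `modules_imported_by_decode(b"abc", "cp500")` would typically report `encodings.cp500`, a module nothing had asked for until the encoding name was supplied. Note that the input must be non-empty; CPython can return early for empty input without looking up the codec at all.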
>>>> Compare with `npm install -g grunt-cli`.
>>> 
>>> npm is different because npm doesn't create top-level script binaries unless you pass the -g option, so you need to install global tooling stuff with -g.  virtualenv is different (and, at least in this case, better).
>> 
>> Pip also doesn't generate binstubs in /usr/bin unless you install globally so pretty much same difference.
> 
> Pip always generates binstubs into whatever prefix you're installing into, whereas npm sometimes doesn't generate binstubs at all; when it does generate them, it puts them in a package-specific directory and not in a common location.  (I don't fully understand the specifics; npm generates local binstubs for coffeescript but not for grunt, for example.)  It's fine to symlink pip's stubs.  Is making the symlink really the sticking point?
> 
> So far, in the "use virtualenv" column, I've got:
> 
> 	• don't break tooling written in python in the host operating system by installing a conflicting dependency by accident
> 	• don't break the host operating system's package database by potentially overwriting packages installed by the package manager
> 	• don't break other pip-installed tools using the system or --user environments
> 	• don't let installing or upgrading any of those things accidentally break the tool (supervisord in this case) later
> 	• make provisioning possible by an unprivileged user, reducing the amount of code that needs to run as root (you can make /usr/lib/supervisord/environment writable by a dedicated user during the install process, to ensure that it doesn't provision anything outside of that directory). This is potentially useful because some setup.py scripts "helpfully" end up doing other weird stuff besides installing the package - there are a few who will remain nameless to protect the guilty which write a bunch of files into root's home directory, for example.
> 	• reduce the potential attack surface of any application with plugins by reducing the number of things that can get imported.
> 
> and on the "use sudo pip install" side, I've got:
> 
> 	• don't have to make a symlink
> 	• users expect applications to install importable modules

I'm not concerned with whether the module is importable specifically, but I am concerned with where the files will live overall. When building generic ops tooling, being unsurprising is almost always the right move, and I would be surprised if supervisor installed to a custom virtualenv. It's a weird side effect of Python not having a great solution for "application packaging", I guess? We've got standards for web-ish applications, but not much for system services. I'm not saying I think creating an isolated "global-ish" environment would be worse; I'm saying nothing does that right now, and I personally don't want to be the first because that brings a lot of pain with it :-)

--Noah
