[Python-ideas] shutil.runret and shutil.runout

anatoly techtonik techtonik at gmail.com
Fri Feb 24 14:00:25 CET 2012


On Fri, Feb 24, 2012 at 2:50 PM, Masklinn <masklinn at masklinn.net> wrote:
> On 2012-02-24, at 12:12 , anatoly techtonik wrote:
>>
>> 1. they require try/catch
>
> No.

Quote from the docs:
"Run command with arguments. Wait for command to complete. If the
return code was zero then return, otherwise raise CalledProcessError."
http://docs.python.org/library/subprocess.html#subprocess.check_call
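
To make the difference concrete, here is a minimal sketch comparing
subprocess.call() with subprocess.check_call(). The child command is
only an illustration (a Python one-liner that exits with status 3); it
is not taken from the docs.

```python
import subprocess
import sys

# A child process that exits with status 3 (sys.executable keeps this portable).
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]

# call() simply hands back the return code; no exception handling is needed.
ret = subprocess.call(failing)

# check_call() raises CalledProcessError on any non-zero status,
# so the caller has to wrap it in try/except to survive a failure.
try:
    subprocess.check_call(failing)
    raised = None
except subprocess.CalledProcessError as exc:
    raised = exc.returncode

print(ret, raised)
```

So call() itself does not need try/except, but check_call() and
check_output() do as soon as a non-zero exit is a normal outcome.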

>> 2. docs still refer to Popen, which IS complicated
>
> True.
>
>> 3. contain shell FUD
>
> No, they contain warnings, against shell injection security
> risks. Warnings are not FUD, it's not trying to sell some sort
> of alternative it's just warning that `shell=True` is dangerous
> on untrusted input.

Warnings would be OK if they provided at least some guidelines on where
shell=True can be useful and where you need to use Popen (or
escaping) instead. Without positive examples, and a little research
showing the attack vectors (so that users can judge whether they apply
to their specific case), it is FUD IMO.
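
As an example of the kind of attack vector the docs could show, here is
a rough sketch. It assumes a POSIX shell and an `echo` program on the
PATH; the "filename" value is a made-up piece of untrusted input.
shlex.quote() exists from Python 3.3 on (pipes.quote() in earlier
versions).

```python
import shlex
import subprocess

# Hypothetical untrusted input, e.g. a "file name" typed into a web form.
filename = "notes.txt; echo INJECTED"

# Unsafe: the shell treats ';' as a command separator and runs the
# second command as well.
unsafe = subprocess.check_output("echo " + filename, shell=True)

# Safer with shell=True: quote the untrusted part so the shell sees
# it as a single argument.
quoted = subprocess.check_output("echo " + shlex.quote(filename), shell=True)

# No shell at all: the argument is passed to the program verbatim.
no_shell = subprocess.check_output(["echo", filename])

print(unsafe)
print(quoted)
print(no_shell)
```

If the command string never contains data from outside the program,
the first form carries no injection risk; that is the case the docs
could describe as safe.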

>> 4. completely confuse users with stdout=PIPE or stderr=PIPE stuff
>>
>> http://docs.python.org/library/subprocess.html#subprocess.check_call
>
> On the one hand, these notes are a bit clumsy. On the other hand,
> piping is a pretty fundamental concept of shell execution, I see
> nothing wrong about saying that these functions *can't* be involved
> in pipes. In fact stating it upfront looks sensible.

The point is that it makes things more complicated than necessary. As
a system programmer I feel confident about all this stuff, but users
struggle to get it and they blame Python for the complexity, and I have
to agree. We can change that with a high-level API: one that
automatically provides a rolling buffer for output when required to
avoid deadlocks (at the cost of losing the older output), and removes
the headache of "what to do about that?".
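
A rolling buffer of this kind could look roughly like the sketch below.
run_rolling() and its keep_lines parameter are hypothetical names, not
an existing stdlib API; the idea is only that the parent keeps reading
so the pipe never fills up, and keeps just the tail of the output.

```python
import collections
import subprocess
import sys


def run_rolling(args, keep_lines=100):
    """Run args and keep only the last keep_lines lines of merged output.

    Hypothetical helper for illustration, not a stdlib function.
    """
    tail = collections.deque(maxlen=keep_lines)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    # Reading continuously drains the OS pipe, so the child never
    # blocks on a full buffer; older lines fall off the deque.
    for line in proc.stdout:
        tail.append(line)
    proc.stdout.close()
    return proc.wait(), list(tail)


# Illustrative child: prints 1000 lines, of which only the last 3 are kept.
child = [sys.executable, "-c", "for i in range(1000): print(i)"]
status, lines = run_rolling(child, keep_lines=3)
print(status, lines)
```

The drawback is the one mentioned above: everything before the kept
tail is thrown away.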

>>> If you do "pip install shell-command" you can also access the
>>> shell_call(), shell_check_call() and shell_output() functions I
>>> currently plan to include in subprocess for 3.3. (I'm not sure which
>>> versions of Python that module currently supports though - 2.7 and
>>> 3.2, IIRC).
>>
>> Don't you find it strange that the shell utils module doesn't have any
>> functions for the main shell function - command execution?
>
> What "shell utils" module? Subprocess has exactly that in `call`
> and its variants. And "shutil" does not bill itself as a
> "shell utils" module right now, its description is
> "High-level file operations".
>
>> shutil.runret()  - by definition has shell=True
>
> Great, so your recommendation is to be completely insecure by default?

Not "by default" - only if it is impossible to make shutil.run*()
functions more secure. They only make sense with shell=True, so my
recommendation is to analyse the security implications and *let* users
make an informed choice: not to frighten them, but to make them think
about security.

The difference: user-friendly docs for the shutil.run*() functions
should be structured as follows:
1. you are free to use these functions
2. but know that they are insecure
3. in these cases:
3.1
3.2
3.3
4. if you think these cases won't apply to your project, then feel
free to use them, otherwise look at subprocess

Of course, if some cases 3.1-3.3 have workarounds, they should be mentioned.

>> That's a high-level _user_ function. When user runs command in shell
>> he sees both. So, this 'shell util' is an analogue.
>
> That makes no sense, when users invoke shell commands programmatically
> (which is what these APIs are about), they expect two semantically
> different reporting streams to be split, not to be merged,
> indistinguishable and unusable as a default. Dropping stderr on the
> ground may be an acceptable default but munging stdout and stderr is not.

Conflict point:
Do users care about stdout/stderr when they invoke shell commands?
Do users care about stdout/stderr when they use Python syntax for
invoking shell commands?

These functions are not syntactic sugar for developers (as the
aforementioned "alternatives" from the subprocess module are). They are
helpers for users. If you're a developer who cares about pipes and
needs programmatic access, there is already the low-level subprocess
API with developer's defaults. If we speak about users:

The standard shell console behaviour is to output both streams to the
screen. That means that if I want to process this output, I don't know
whether it comes from stderr or stdout. So, if I want to process the
output, I use Python to do this. If I know that I need the output from
stderr only, I specify this explicitly. That's my default user story.

>> The main purpose of this function is to be useful from Python console
>
> Then I'm not sure it belongs in subprocess or shutil, and users with that
> need should probably be driven towards iPython which provides extensive
> means of calling into the system shell in interactive sessions[0].
> bpython may also provide such facilities.

I think it is a good idea to unify the interface across interactive
modes in Python. Hopefully shutil.copy and friends are already good
enough that these shells have no reason to reimplement them (and users
no reason to learn new commands).

>> The main purpose of this function is to be
>> useful from Python console, so the interface should be very simple to
>> remember from the first try. Like runout(command,
>> ret='stdout|stderr|both').
>
> As opposed to `check_output(command)`?

As opposed to check_output(command, *, stdin=None, stdout=None,
stderr=None, shell=True)
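
For comparison, the proposed helpers could be sketched on top of
subprocess as below. This is only an illustration of the idea: runret()
and runout() do not exist in shutil, and reading ret as a choice between
'stdout', 'stderr' and 'both' (merged) is my assumption about the
interface.

```python
import subprocess


def runret(command):
    """Run command through the shell and return its exit code (sketch)."""
    return subprocess.call(command, shell=True)


def runout(command, ret="stdout"):
    """Run command through the shell and return the captured output as bytes.

    ret selects 'stdout', 'stderr' or 'both' (both streams merged).
    Hypothetical interface, not an existing shutil function.
    """
    if ret == "stdout":
        out, err = subprocess.PIPE, None       # stderr still goes to the console
    elif ret == "stderr":
        out, err = None, subprocess.PIPE       # stdout still goes to the console
    elif ret == "both":
        out, err = subprocess.PIPE, subprocess.STDOUT
    else:
        raise ValueError("ret must be 'stdout', 'stderr' or 'both'")
    proc = subprocess.Popen(command, shell=True, stdout=out, stderr=err)
    stdout_data, stderr_data = proc.communicate()
    return stderr_data if ret == "stderr" else stdout_data


# Illustrative calls, assuming a POSIX shell.
code = runret("exit 4")
merged = runout("echo out; echo err 1>&2", ret="both")
print(code, merged)
```

Both helpers pass the string straight to the shell, so the injection
caveats for shell=True apply to them in full.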

>> It won't be a 'shell util' function anymore. If you're using shell
>> execution functions, you already realize what will happen if your
>> input parameters are not validated properly.
>
> This assertion demonstrably does not match reality, shell injections
> (the very reason for this warning) would not exist if this were the
> case.

It is not an assertion, it is a wish for the shutil documentation to
explain shell injection problems to a level that allows users to
make a reasonable choice, so that if a user is "using shell execution
functions, he already realizes what will happen if his input parameters
are not validated properly".

-- 
anatoly t.


