[Tutor] Open a libreoffice calc file in Python

Thu Dec 22 16:20:11 EST 2016

On Thu, Dec 22, 2016 at 4:20 PM, boB Stepp <robertvstepp at gmail.com> wrote:
>
> Both you and Eryk seem to be speaking in terms of using
> subprocess.Popen() directly.  So I think I need some clarification.
> At https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module
> it says:
>
> "The recommended approach to invoking subprocesses is to use the run()
> function for all use cases it can handle. For more advanced use cases,
> the underlying Popen interface can be used directly."

The high-level functions such as `run` pass their positional arguments
and any Popen keyword arguments straight through to the Popen
constructor. Everything said about passing args as a list or as a
string, and about shell=True, applies equally to the high-level API.
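
To illustrate the pass-through, here's a minimal sketch that calls
`run` and Popen with the same arguments. The child command (the current
interpreter printing "spam") is just a stand-in I made up for the
example.

```python
import subprocess
import sys

# Hypothetical child command: the current interpreter printing a word.
cmd = [sys.executable, '-c', 'print("spam")']

# stdout and universal_newlines are Popen keyword arguments; run()
# forwards them to the Popen constructor unchanged.
result = subprocess.run(cmd, stdout=subprocess.PIPE,
                        universal_newlines=True)

# The same arguments used with Popen directly.
with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                      universal_newlines=True) as proc:
    popen_out, _ = proc.communicate()

print(result.stdout == popen_out)
```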

> My current understanding as to why the subprocess module is preferred
> to using the older os.system() is to avoid shell injection attacks.
> So my assumption is that even when using "shell=True" with either
> run() or Popen(), this is avoided.  Is this true?

No. Using shell=True is insecure with either run() or Popen(). The
command-line string is passed directly to the shell, the same as with
os.system. Avoid shell=True as much as possible, especially when the
command is built from user input.
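
As a rough sketch of the safe alternative, pass the command as a list
and leave shell at its default of False. The "user input" string and
the child script below are made up for illustration. The child only
prints the arguments it receives.

```python
import subprocess
import sys

# Hypothetical untrusted input that tries to smuggle in a second command.
user_input = 'report.ods; echo INJECTED'

# Child script that just reports the arguments it was given.
script = 'import sys; print(sys.argv[1:])'

# With a list and shell=False (the default), no shell parses the string,
# so the whole value arrives in the child as one literal argument.
safe = subprocess.run([sys.executable, '-c', script, user_input],
                      stdout=subprocess.PIPE, universal_newlines=True)
print(safe.stdout)

# By contrast, interpolating user_input into a command string and using
# shell=True would let a POSIX shell treat ';' as a command separator
# and run the trailing echo as a separate command.
```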

But the subprocess module has more to offer than just avoiding
the shell. For example, it allows communicating directly with the
child process using standard I/O (stdin, stdout, stderr) pipes;
controlling inheritance of Unix file descriptors or Windows handles;
and creating a new Unix session or Windows process group.
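
For the standard I/O part, a minimal sketch might look like the
following. The child is a made-up one-liner that upper-cases whatever
it reads from stdin. (The other items correspond to Popen parameters
such as close_fds, pass_fds and start_new_session on Unix, which I
haven't shown here.)

```python
import subprocess
import sys

# Hypothetical child: upper-cases whatever it reads from stdin.
child = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

proc = subprocess.Popen([sys.executable, '-c', child],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True)

# communicate() writes the input, closes stdin, reads both output pipes
# to EOF, and waits for the child to exit.
out, err = proc.communicate('hello from the parent\n')
print(out, end='')
```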

On Windows, Popen also allows passing creationflags [1] (e.g.
CREATE_NEW_CONSOLE, CREATE_NO_WINDOW, DETACHED_PROCESS) and a subset
of the process startupinfo [2], including the standard handles and
wShowWindow.

[1]: https://msdn.microsoft.com/en-us/library/ms684863
[2]: https://docs.python.org/3/library/subprocess.html#subprocess.STARTUPINFO
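
Here's a sketch of how those Windows-only options might be passed. The
helper `windows_popen_kwargs` is a name I invented for the example. It
returns an empty dict on other platforms, because STARTUPINFO and the
related constants only exist in the subprocess module on Windows.

```python
import subprocess
import sys


def windows_popen_kwargs():
    """Return extra Popen keyword arguments (hypothetical helper).

    On non-Windows platforms this returns an empty dict.
    """
    kwargs = {}
    if sys.platform == 'win32':
        # Ask for a new console, but request that its window be hidden.
        si = subprocess.STARTUPINFO()
        si.dwFlags |= subprocess.STARTF_USESHOWWINDOW
        si.wShowWindow = subprocess.SW_HIDE
        kwargs['startupinfo'] = si
        kwargs['creationflags'] = subprocess.CREATE_NEW_CONSOLE
    return kwargs


proc = subprocess.run([sys.executable, '-c', 'print("child ran")'],
                      stdout=subprocess.PIPE, universal_newlines=True,
                      **windows_popen_kwargs())
print(proc.stdout, end='')
```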

> Are there subtleties as to when to use run() and when to use Popen()?

The high-level API executes a child process synchronously -- i.e. it
writes the (optional) input, reads the output, closes the standard I/O
files, and waits for the process to exit. For example, here's the
implementation of `run` in 3.5:

    def run(*popenargs, input=None, timeout=None, check=False, **kwargs):
        if input is not None:
            if 'stdin' in kwargs:
                raise ValueError(
                    'stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE
        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(
                                    input, timeout=timeout)
            except TimeoutExpired:
                process.kill()
                stdout, stderr = process.communicate()
                raise TimeoutExpired(process.args, timeout,
                                     output=stdout, stderr=stderr)
            except:
                process.kill()
                process.wait()
                raise
            retcode = process.poll()
            if check and retcode:
                raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
        return CompletedProcess(process.args, retcode, stdout, stderr)
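
To see the `input`, `timeout` and `check` parameters from that
signature in action, here's a small sketch. Both child commands are
made-up one-liners: one counts the characters it reads from stdin, the
other exits with status 3.

```python
import subprocess
import sys

# Hypothetical child: prints the number of characters read from stdin.
child = 'import sys; print(len(sys.stdin.read()))'

# input= makes run() set stdin=PIPE and hand the data to communicate().
done = subprocess.run([sys.executable, '-c', child],
                      input='12345', stdout=subprocess.PIPE,
                      universal_newlines=True, timeout=30, check=True)
print(done.stdout, end='')

# check=True turns a nonzero exit status into CalledProcessError.
failed_code = None
try:
    subprocess.run([sys.executable, '-c', 'import sys; sys.exit(3)'],
                   check=True)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
print(failed_code)
```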

In most cases the high-level API is all that you'll need. Occasionally
you may need to run a child process asynchronously or interact with a
long-running process. In that case, call Popen directly.
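
A minimal sketch of the asynchronous case, assuming a made-up child
that sleeps for half a second before printing a line:

```python
import subprocess
import sys

# Hypothetical long-running child: sleeps briefly, then prints a line.
child = 'import time; time.sleep(0.5); print("done")'

# Popen returns as soon as the child has been started.
proc = subprocess.Popen([sys.executable, '-c', child],
                        stdout=subprocess.PIPE,
                        universal_newlines=True)

# poll() returns None for as long as the child is still running.
still_running = proc.poll() is None

# ... the parent is free to do other work here ...

# Collect the output and wait for the child when we're ready for it.
out, _ = proc.communicate(timeout=30)
print(out, end='')
```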