[Python-Dev] ImportError: No module named multiarray (is back)

Mathieu Malaterre mathieu.malaterre at gmail.com
Mon Dec 5 16:26:50 CET 2011


Hi Zbyszek,

  See my comment below.

2011/11/26 Zbigniew Jędrzejewski-Szmek <zbyszek at in.waw.pl>:
> Hi,
> I apologize in advance for the length of this mail.
>
> sys.path
> ========
> When a script or a module is executed by invoking python with proper
> arguments, sys.path is extended. When a path to a script is given, the
> directory containing the script is prepended. When '-m' or '-c' is used,
> $CWD is prepended. This is documented in
> http://docs.python.org/dev/using/cmdline.html, so far ok.
>
> sys.path and $PYTHONPATH are like $PATH -- if you can convince someone to put
> a directory under your control in any of them, you can execute code as this
> someone. Therefore, sys.path is dangerous and important. Unfortunately,
> sys.path manipulations are only described very briefly, and without any
> commentary, in the on-line documentation. python(1) manpage doesn't even
> mention them.
>
> The problem: each of the commands below is insecure:
>
> python /tmp/script.py                 (when script.py is safe by itself)
>        ('/tmp' is added to sys.path, so an attacker can override any
>         module imported in /tmp/script.py by writing to /tmp/module.py)
>
> cd /tmp && python -mtimeit -s 'import numpy' 'numpy.test()'
>        (UNIX users are accustomed to being able to safely execute
>         programs in any directory, e.g. ls, or gcc, or something.
>
>         Here '' is added to sys.path, so it is not secure to run
>         python in other-user-writable directories.)
>
> cd /tmp/ && python -c 'import numpy; print(numpy.version.version)'
>         (The same as above, '' is added to sys.path.)
>
> cd /tmp && python
>         (The same as above).
>
> IMHO, if this (long-lived) behaviour is necessary, it should at least be
> prominently documented. Also in the manpage.
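
To make the first case concrete, here is a minimal sketch (not part of
the original mail) that recreates it in a scratch directory instead of
/tmp. The file names and the choice of colorsys as the overridden
module are my own assumptions, picked only for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_script_next_to_hostile_module():
    """Run a harmless script that sits beside a module shadowing the stdlib.

    Returns (script_dir_is_first, who): whether sys.path[0] was the
    script's directory, and which 'colorsys' the script ended up importing.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the attacker's /tmp/module.py (hypothetical content).
        with open(os.path.join(tmp, "colorsys.py"), "w") as f:
            f.write("WHO = 'attacker'\n")
        # Stand-in for /tmp/script.py, which is "safe by itself".
        script = os.path.join(tmp, "script.py")
        with open(script, "w") as f:
            f.write(
                "import sys\n"
                "import colorsys\n"
                "print(sys.path[0])\n"
                "print(getattr(colorsys, 'WHO', 'stdlib'))\n"
            )
        # Newer interpreters can disable the prepending via PYTHONSAFEPATH;
        # drop it so the sketch shows the default behaviour.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
        out = subprocess.run(
            [sys.executable, script],
            capture_output=True, text=True, check=True, env=env,
        )
        path0, who = out.stdout.splitlines()
        same_dir = os.path.realpath(path0) == os.path.realpath(tmp)
        return same_dir, who
```

If the script's directory really is searched first, the second value
is 'attacker' rather than 'stdlib'.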
>
> Prepending realpath(dirname(scriptname))
> ========================================
> Before adding a directory to sys.path as described above, Python actually
> runs os.path.realpath over it. This means that if the path to a script given
> on the commandline is actually a symlink, the directory containing the real
> file will be prepended to sys.path. This behaviour is not really documented (the
> documentation only says "the directory containing that file is added to the
> start of sys.path"), but since the integrity of sys.path is so important, it
> should be, IMHO.
>
> Using realpath instead of the (expected) path specified by the user breaks
> imports of non-pure-python (mixed .py and .so) modules from modules executed
> as scripts on Debian. This is because Debian installs
> architecture-independent python files in /usr/share/pyshared, and symlinks
> those files into /usr/lib/pymodules/pythonX.Y/. The architecture-dependent
> .so and python-version-dependent .pyc files are installed in
> /usr/lib/pymodules/pythonX.Y/. When a script, e.g.
> /usr/lib/pymodules/pythonX.Y/script.py, is executed, the directory
> /usr/share/pyshared is prepended to sys.path. If the script tries to import
> a module which has architecture-dependent parts (e.g. numpy) it first sees
> the incomplete module in /usr/share/pyshared and fails.
>
> This happens for example in parallel python
> (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620551) and recently when
> packaging CellProfiler for Debian.
>
> Again, if this is on purpose, it should be documented.
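
The symlink behaviour can be checked with a small sketch like the one
below. It is only an illustration: the "pyshared" and "pymodules"
directories are scratch stand-ins for the Debian layout described
above, and it assumes a platform where os.symlink is available.

```python
import os
import subprocess
import sys
import tempfile


def sys_path0_for_symlinked_script():
    """Run a script through a symlink and report what sys.path[0] was.

    Returns (path0, real_dir, link_dir), all as resolved paths.
    """
    with tempfile.TemporaryDirectory() as tmp:
        real_dir = os.path.join(tmp, "pyshared")   # like /usr/share/pyshared
        link_dir = os.path.join(tmp, "pymodules")  # like /usr/lib/pymodules/pythonX.Y
        os.mkdir(real_dir)
        os.mkdir(link_dir)

        real_script = os.path.join(real_dir, "script.py")
        with open(real_script, "w") as f:
            f.write("import sys\nprint(sys.path[0])\n")

        # The script is invoked through the symlink, as on Debian.
        link_script = os.path.join(link_dir, "script.py")
        os.symlink(real_script, link_script)

        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
        out = subprocess.run(
            [sys.executable, link_script],
            capture_output=True, text=True, check=True, env=env,
        )
        path0 = os.path.realpath(out.stdout.strip())
        return path0, os.path.realpath(real_dir), os.path.realpath(link_dir)
```

On the interpreters I would expect this to run on, path0 equals
real_dir and not link_dir, which is exactly why the .so files next to
the symlink are not found.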
>
> PEP 395 (Qualified Names for Modules)
> =====================================
>
> PEP 395 proposes another sys.path manipulation. When running a script, the
> directory tree will be walked upwards as long as there are __init__.py
> files, and then the first directory without one will be added.
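
As I read that description, the walk could be sketched roughly as
follows. This is my own illustration of the paragraph above, not code
from PEP 395, and the function name is hypothetical.

```python
import os
import tempfile


def first_dir_without_init(script_path):
    """Walk upwards from the script's directory while __init__.py exists.

    Returns the first directory that does not contain __init__.py, i.e.
    the directory that would be added to sys.path under this scheme.
    """
    directory = os.path.dirname(os.path.abspath(script_path))
    while os.path.isfile(os.path.join(directory, "__init__.py")):
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            break
        directory = parent
    return directory
```

For a script at pkg/sub/run.py where both pkg and pkg/sub contain
__init__.py, the result is the directory that contains pkg.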
>
> This is of course a fine idea, but it makes a scenario, which was previously
> safe, insecure. More precisely, when executing a script in a directory whose
> parent directory is writable by other users, the parent directory will be
> added to sys.path.
>
> So the (safe) operation of downloading an archive with a package, unzipping
> it in /tmp, changing into the created directory, checking that the script
> doesn't do anything bad, and running a script is now insecure if there is
> an __init__.py in the archive root.
>
>
> I guess that it would be useful to have an option to turn off those sys.path
> manipulations.
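
Until such an option exists, a script can at least clean up sys.path
itself before importing anything else. The sketch below is only one
possible approach under my own assumptions: the helper names are
hypothetical, it only looks at the permission bits of each directory
(not of its parents), and it relies on POSIX-style modes.

```python
import os
import stat
import sys


def is_other_writable(directory):
    """True if the 'other' write bit is set on the directory."""
    try:
        mode = os.stat(directory).st_mode
    except OSError:
        return False  # missing entries cannot be written into as-is
    return bool(mode & stat.S_IWOTH)


def drop_unsafe_entries(path_entries):
    """Return path_entries without '' and without other-writable directories."""
    safe = []
    for entry in path_entries:
        if entry == "":  # '' means the current working directory
            continue
        if is_other_writable(entry):
            continue
        safe.append(entry)
    return safe
```

A script would call sys.path[:] = drop_unsafe_entries(sys.path) as its
first statement. This does not help with 'python -m', because the
module is located before any of its code runs.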


Thanks very much for the detailed explanation. Given this, I believe I
can safely give up on CellProfiler packaging until this issue is
addressed upstream (either in CellProfiler using an indirection, or in
python).

Thanks,
-- 
Mathieu

