[Tutor] Best way to get root project directory in module search path

Mon Aug 28 22:59:29 EDT 2017

On 28Aug2017 20:39, boB Stepp <robertvstepp at gmail.com> wrote:
>On Sun, Aug 27, 2017 at 6:03 PM, Cameron Simpson <cs at cskk.id.au> wrote:
>> from os.path import dirname
>It did not occur to me to import something from os.path to use sys.path!

It is pretty normal to import specific names from modules rather than 
laboriously writing things out like "os.path.dirname" everywhere. It makes code 
both easier to write and easier to read/maintain/debug.
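
As a small illustration of the two spellings (the path here is a made-up
placeholder, not anything from your project), both forms call the same
function and give the same answer:

```python
import os.path
from os.path import dirname, join

# Hypothetical relative path, used only for illustration.
path = join("project", "pkg", "module.py")

# Fully qualified spelling: the module prefix is repeated at each call.
parent_long = os.path.dirname(os.path.dirname(path))

# Imported-name spelling: same function, less noise at the call site.
parent_short = dirname(dirname(path))
```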

[...]
>>> BTW, I have not used os.walk yet, but would that be better than my bit
>>> of string manipulation above?  If yes, an example of walking up and
>>> getting the directory exactly one level up from __file__ would be
>>> appreciated.
>
>> Given that os.walk walks _down_ the tree from some directory, how do you
>> think it would help you?
>
>At https://docs.python.org/release/2.4.4/lib/os-file-dir.html it says:
>
>"walk(top[, topdown=True [, onerror=None]])
>walk() generates the file names in a directory tree, by walking the
>tree either top down or bottom up."
>
>So I thought I could go in either direction.

Ah, no. It always walks down. The documentation is discussing in what order 
your call gets the results: upper directories before lower directories or the 
reverse.
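
A minimal sketch of the difference, assuming a throwaway directory tree built
just for the demonstration (the names "a", "b" and the helper functions are
made up). Both calls walk down from the top; only the reporting order changes.
Going up from a file needs no walk at all, just dirname:

```python
import os
import tempfile
from os.path import abspath, dirname, join, relpath

def walk_order(top, topdown):
    """Return the directories os.walk reports, in order, relative to top."""
    return [relpath(dirpath, top)
            for dirpath, dirnames, filenames in os.walk(top, topdown=topdown)]

def one_level_up(file_path):
    """Return the directory one level above the directory holding file_path."""
    return dirname(dirname(abspath(file_path)))

with tempfile.TemporaryDirectory() as top:
    os.makedirs(join(top, "a", "b"))
    down = walk_order(top, topdown=True)    # parents reported before children
    up = walk_order(top, topdown=False)     # children reported before parents
```

Inside a module, one_level_up(__file__) would give the directory above the one
containing that module.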

>>> I have been using the same project structure for some years now, so in
>>> that sense it is well-established and not fragile.
>>
>> It is more that embedded specific hardwired knowledge in the code is an
>> inherently fragile situation, if only because it must be remembered when
>> naming changes occur. And they do. I have a quite large package I've been
>> working on for a decade and am very seriously contemplating changing its
>> name in the next month or so. Fortunately the installed base is small.
>
>I was working on removing the name dependencies.  So far I had the following:
>
>import os
>import sys
>
>def set_project_root():
>    pgm_dir = os.path.dirname(os.path.abspath(__file__))
>    dir_to_rmv = pgm_dir.split(os.sep)[-1]
>    pgm_root = pgm_dir.replace(dir_to_rmv, '')
>    sys.path.append(pgm_root)
>
>But this only works if I only need to go up one level.  If I have
>Python files more deeply nested in the project structure than this
>won't work.  So I was pondering how I can do this for arbitrarily
>deeply nested folders without relying on any specific naming.  I was
>also wondering if this sort of code could be put in the __init__.py
>files?  I have yet to do this latter experiment.

If you really want to work this way you could start with dirname(__file__), and 
see if os.path.join(dirname(__file__), '__init__.py') exists. Keep going up 
while each parent directory also has one. The last directory with an 
__init__.py is the top of your particular package, and you might then presume 
that the level above that is a useful thing to add to sys.path.
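
A rough sketch of that upward search, assuming ordinary packages which mark
every level with an __init__.py (namespace packages without one would defeat
it). The function name and the demo tree are made up for illustration:

```python
import os
import tempfile
from os.path import abspath, dirname, exists, join

def find_package_top(module_file):
    """Return the highest ancestor directory of module_file that still
    contains an __init__.py, or None if its own directory has none."""
    top = None
    d = dirname(abspath(module_file))
    while exists(join(d, "__init__.py")):
        top = d
        parent = dirname(d)
        if parent == d:         # reached the filesystem root
            break
        d = parent
    return top

# Demo on a throwaway tree: root/pkg/sub/mod.py, with __init__.py in pkg and sub.
with tempfile.TemporaryDirectory() as root:
    sub = join(root, "pkg", "sub")
    os.makedirs(sub)
    for d in (join(root, "pkg"), sub):
        open(join(d, "__init__.py"), "w").close()
    mod = join(sub, "mod.py")
    open(mod, "w").close()
    found_top = find_package_top(mod)
    expected_top = abspath(join(root, "pkg"))
    # dirname(found_top) is the level one might then add to sys.path.
    path_candidate = dirname(found_top)
    expected_candidate = abspath(root)
    no_package = find_package_top(join(root, "loose.py"))
```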

But it still feels ... unreliable to me. I really do feel better requiring this 
setup to be correctly sorted before the module gets loaded.

Guessing/inferring where to find other modules (which is what hacking sys.path 
amounts to) is in my mind "policy". I like to keep policy separate from 
mechanism when I can. In this instance, my thinking is that sys.path is the 
mechanism, but deciding what to put in it is policy. Policy is for the caller, 
either directly or courtesy of some sysadmin "install" process.

When you put policy inside your code as an integral and, worse, automatic 
action, you're overriding the caller's own policy. That might be bad.

>>> So I am moving to put everything in a centrally accessible
>>> location
>>
>> Very sound. But to use it your users should have that location as part of
>> their python path, or the code needs to be invoked by some wrapper which can
>> insert the needed path. Both these things live _outside_ the package code.
>
>Not being well-versed in Unix, is it possible to change the PYTHONPATH
>*only* for a specific group of users and only for python programs in a
>certain shared library area?  I will have to research this.  If
>possible, this might indeed be the best way.

Um, not very easily? The usual approach is instead to write a wrapper script.

The invoked Python program and everything it invokes will get any changes you 
make to the environment in the wrapper (environment variables such as 
$PYTHONPATH are automatically inherited from a process to any of its children, 
absent special arrangements to the contrary).
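
The inheritance can be seen from Python itself. This is only a sketch: the
helper name is made up, and the directory is a placeholder which need not
exist. The parent changes its own environment, starts a child interpreter
without passing any environment explicitly, and the child reports what it sees:

```python
import os
import subprocess
import sys

def child_value(varname):
    """Start a child Python and return the value it sees for varname."""
    code = ("import os, sys; "
            "sys.stdout.write(os.environ.get(sys.argv[1], '<unset>'))")
    out = subprocess.check_output([sys.executable, "-c", code, varname])
    return out.decode()

# Placeholder directory, used purely for illustration.
shared_lib = os.path.join(os.sep, "path", "to", "your", "shared", "lib")

saved = os.environ.get("PYTHONPATH")
os.environ["PYTHONPATH"] = shared_lib       # change the parent's environment
try:
    seen_by_child = child_value("PYTHONPATH")   # no env= argument is passed
finally:
    # Put the parent's environment back the way it was.
    if saved is None:
        del os.environ["PYTHONPATH"]
    else:
        os.environ["PYTHONPATH"] = saved
```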

But "programs" are invoked by executing something. You can stick your python 
modules where they belong, and have the programme be just the wrapper shell 
script which knows how to invoke them.

For example, I've got a small python module which has some knowledge about 
reading HAProxy configs and/or its status page, and the module has a command 
line mode (the main function, which would be in __main__ if it were a package - 
it is just a single module). I have a command called "haproxy-tool" for running 
that python function, and it is just a shell script wrapper, thus:

  #!/bin/sh
  exec py26+ -m cs.app.haproxy ${1+"$@"}

"py26+" is itself just a small script to find a python 2 which is at least 
python 2.6.

But you can see that it's perfectly reasonable to make the "command" flavour of 
your python module a separate script, and you can put whatever preamble you 
like, such as hacking $PYTHONPATH, into that script before it invokes the 
python interpreter.
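
If a shell is not convenient, the same kind of wrapper can be sketched in
Python. Everything named here is an assumption for illustration: SHARED_LIB and
MODULE are placeholders, and wrapper_env and main are made-up helper names. The
preamble builds a modified copy of the environment and hands it only to the
child interpreter:

```python
import os
import subprocess
import sys

SHARED_LIB = "/path/to/your/shared/lib"     # placeholder location
MODULE = "your.module.name"                 # placeholder module name

def wrapper_env(shared_lib, environ=None):
    """Return a copy of the environment with shared_lib at the front of
    PYTHONPATH. The caller's own environment is left untouched."""
    env = dict(os.environ if environ is None else environ)
    old = env.get("PYTHONPATH", "")
    # Avoid a trailing separator when PYTHONPATH was empty or unset.
    env["PYTHONPATH"] = shared_lib + (os.pathsep + old if old else "")
    return env

def main(argv):
    """Run MODULE with the adjusted environment and return its exit status."""
    cmd = [sys.executable, "-m", MODULE] + list(argv)
    return subprocess.call(cmd, env=wrapper_env(SHARED_LIB))
```

A real wrapper script would finish with sys.exit(main(sys.argv[1:])). As with
the shell version, the change applies only to the python which gets run, not to
the caller's wider environment.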

[...]
>> I think you're better off arranging your users' environments to include the
>> shared library area, or providing a wrapper shell script which does that and
>> then invokes the real code.
>>
>> Eg:
>>
>>  #!/bin/sh
>>  PYTHONPATH=/path/to/your/shared/lib:$PYTHONPATH
>>  export PYTHONPATH
>>  python -m your.module.name ${1+"$@"}
>>
>> Such a script has the advantage that the $PYTHONPATH change applies only to
>> the python running your module, not to your users' wider environment.
>
>This looks doable, though it complicates how I will call my python
>programs.  These are always invoked by the Pinnacle HotScript
>language.  Everything must start from within the Pinnacle planning
>environment.

Well, that makes things easier, potentially. Write a HotScript function to make 
the same arrangement and invoke the Python code. Make all invocations go via 
that function instead of directly.

Cheers,
Cameron Simpson <cs at cskk.id.au> (formerly cs at zip.com.au)