[Tutor] What is the best way for a program suite to know where it is installed?

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun Oct 21 17:08:16 EDT 2018


(Resending to the list - sorry for the duplicate, Bob)

On Sun, 21 Oct 2018 at 04:02, boB Stepp <robertvstepp at gmail.com> wrote:
>
> I was just re-reading the entire thread at
>     https://www.mail-archive.com/tutor@python.org/msg77864.html
> And I have asked similar questions previous to that thread.  I still
> do not have a satisfactory answer for the subject line's question that
> both makes sense to me and seems to be good programming practice.  And
> that works for any OS that the program suite is installed under.
>
> A constantly recurring situation I am having is a programming project
> with a nested directory structure.  Usually I will have "tests" and
> "data" folders nested under the top level project folder.  There may
> be others as well depending on the complexity of the project I am
> attempting.

Is this project a library or an application?

If it is a library then it is important to do things in the way that
users of your library will expect so that they can reasonably install
and use your library or package it up along with other libraries as
part of an application. If it is an application then the requirements
are weaker - it really depends how you package and distribute your
application which is something that is under your control.

> As far as I can tell, I cannot make any assumptions about
> what the current working directory is when a user starts the program.

That depends what kind of an application/library it is and how it is
started. For a library you certainly shouldn't make assumptions about
this. Command line programs that are installed on PATH can be run from
any directory. GUI programs could be run with anything as the current
working directory.

It may not make sense for your program to be installed on PATH though
and you might document that a user should run your program by running
a particular command in a particular directory. There might be other
reasons for doing this such as ensuring that any Python modules are
importable.

Generally speaking though it shouldn't be necessary to depend in any
way on the current working directory just to locate your application's
data files.

> So how can one of my programs *know* what the absolute path is to,
> say, the top level of the program suite?

I assume here that you mean "how can code in a particular Python
module find the base directory that my application/library is
installed in?". The simplest answer is to use __path__/__file__. If
you have many Python files organised into packages and subpackages etc
then you should do this in one place near the top. So if you have

   pkg
       config.py
       data.txt
       subpkg1
            submod1.py

then config.py can have the code to check __file__ and work out the
path of the directory containing the pkg folder. The other modules can
all import that from pkg.config.
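
A minimal sketch of what that could look like, assuming the layout
above. The helper name locate_dirs and the constant names in the
comments are made up for illustration; only the os.path calls are
real.

```python
import os


def locate_dirs(module_file):
    """Return (package_dir, install_dir) for a module's __file__ value.

    package_dir is the folder holding the module itself (pkg in the
    layout above); install_dir is the folder that contains pkg.
    """
    package_dir = os.path.dirname(os.path.abspath(module_file))
    install_dir = os.path.dirname(package_dir)
    return package_dir, install_dir


# In pkg/config.py this would be called once with that module's __file__:
#     PACKAGE_DIR, INSTALL_DIR = locate_dirs(__file__)
#     DATA_FILE = os.path.join(PACKAGE_DIR, 'data.txt')
# and pkg/subpkg1/submod1.py would then only need:
#     from pkg.config import DATA_FILE
```

Keeping the __file__ arithmetic in one module means that if the
layout changes there is only one place to update.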

> If I could always reliably
> determine this by my program's code, every other location could be
> determined from the known nested structure of the program suite.  [But
> I have a dim recollection that either Ben or Cameron thought relying
> on the "known nested structure" is brittle at best.]

One way in which this can go wrong is if you are distributing a
library. It is possible that your library could end up being embedded
in a zip-file so that the files don't exist unzipped on disk.

For the library case it certainly used to be recommended to use
pkg_resources for this:
https://setuptools.readthedocs.io/en/latest/pkg_resources.html#basic-resource-access

So for the example above you would use something like:
    import pkg_resources
    data_bytes = pkg_resources.resource_string('pkg', 'data.txt')
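
To make the zip-file problem concrete, here is a rough demonstration.
It does not use pkg_resources; it uses pkgutil.get_data from the
standard library instead, which also asks the package's loader for
the bytes. The package name zipped_demo_pkg and the archive name are
invented for the demo.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Hypothetical package name, chosen only for this demonstration.
PACKAGE = 'zipped_demo_pkg'

# Build a zip file containing a package with one data file.
# (The temporary directory is not cleaned up, to keep the sketch short.)
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, 'bundle.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr(PACKAGE + '/__init__.py', '')
    zf.writestr(PACKAGE + '/data.txt', 'hello from inside the zip\n')

# Make the archive importable, as if the library had been installed zipped.
sys.path.insert(0, archive)
importlib.invalidate_caches()
module = importlib.import_module(PACKAGE)

# The path derived from __file__ points inside the archive, so it is not
# a real file on disk and open() on it would fail.
path_from_file = os.path.join(os.path.dirname(module.__file__), 'data.txt')
exists_on_disk = os.path.exists(path_from_file)

# Asking the import machinery for the bytes works regardless.
data_bytes = pkgutil.get_data(PACKAGE, 'data.txt')

print(exists_on_disk)
print(data_bytes)
```

The point is only that a path built from __file__ need not be a real
file, whereas going through the loader still finds the data.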

> So far the best method I've come up with is to make use of "__file__"
> for the initiating program file.  But from past discussions I am not
> certain this is the *best* way.  Is the *best* way going to get me
> into best techniques for installing and distributing programs?  If so,
> this strikes me as a huge mess to dive into!

I think that for a simple application this is fine. You are in control
of how your application is packaged and installed so you can ensure
that this will always work if someone runs your application in the
intended way.

So you can do something like:
    import os
    project_root = os.path.abspath(os.path.dirname(__file__))
    data_file_path = os.path.join(project_root, 'data.txt')

In the other thread you seemed confused that the effect of abspath (or
similarly realpath) for a given input string depends on the current
directory. That's because the contents of the string __file__ may also
depend on the current working directory. The point of abspath/realpath
is that these effects should cancel so that the resulting string does
not depend on the current working directory.
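
A small sketch of that cancellation, assuming __file__ holds a path
relative to wherever the program was started. The helper resolve_from
and the names project and main.py are invented for the demo, and
main.py never needs to exist because abspath does not touch the disk.

```python
import os
import tempfile


def resolve_from(cwd, relative_name):
    """Return os.path.abspath(relative_name) as computed from cwd."""
    saved = os.getcwd()
    os.chdir(cwd)
    try:
        return os.path.abspath(relative_name)
    finally:
        os.chdir(saved)


with tempfile.TemporaryDirectory() as top:
    # Resolve symlinks up front so both results are directly comparable.
    top = os.path.realpath(top)
    sub = os.path.join(top, 'project')
    os.mkdir(sub)

    # Started from the parent directory, the relative path would be
    # 'project/main.py'; started from inside project it would be
    # 'main.py'.
    from_top = resolve_from(top, os.path.join('project', 'main.py'))
    from_sub = resolve_from(sub, 'main.py')

print(from_top == from_sub)
```

The two relative strings differ and the two working directories
differ, but the absolute result is the same in both cases.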

--
Oscar

