[Distutils] Experience of setuptools' cache design

Fri Jan 18 11:34:13 CET 2008

Hello.

Being nothing but an innocent bystander, I have not thoroughly
searched the archives of this list, and might be writing about
something that has long been "old news". If so, I apologize.

This is a small story about what I, being an end user that did not  
know what "setuptools" was, experienced when I installed the latest  
MySQLdb module (and I am aware that there might very well be issues  
with how it was installed).

My hope is that this is useful to the developers of setuptools in  
determining if they meet the design criteria they have set up.

A colleague of mine installed the latest MySQLdb module, which uses  
eggs and setuptools. We use Python and MySQL heavily on our mailserver.

In that environment, user names and home directories do not always  
match, and indeed most users running the programs are system users  
and do not have home directories at all. This turned out to be a
problem with the setuptools-based MySQLdb module.

The reason is the .python-eggs directory.

Our configuration daemon runs as a system user "graald", a user
without a home directory (that user does not have a valid shell
either). After updating MySQLdb to the setuptools-based version, the
Python server code no longer started, because setuptools tried to
write to the .python-eggs directory in the user's $HOME. And $HOME is
either not a valid directory or, often, the home directory of the
person who started the script rather than of the account it runs as.

Ok, so we can set the PYTHON_EGG_CACHE environment variable, which  
solves the problem (but in an unclean way).

Then comes the Python script that runs as "virtmail" and tries to  
connect to MySQL, "virtmail" being another user without a home  
directory. And of course it does not start, and cannot have the same  
PYTHON_EGG_CACHE directory, since both these users cannot write to  
the same directory. And, not having a home directory, the "virtmail"  
user does not have a .bashrc or similar where we can set that user's  
own cache directory.

We cannot set PYTHON_EGG_CACHE in the script itself either, as we
cannot know what to set it to: if two different system users run the
same script, they need different cache directories. Uh-oh. We need to
implement code to choose the cache directory depending on the user
that's running.
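
A minimal sketch of what such code might look like, assuming a POSIX
system and a reasonably recent Python. The function names and the
"python-eggs-<user>" naming are made up for illustration; only the
PYTHON_EGG_CACHE variable itself comes from setuptools. The directory
is derived from the effective user, not from $HOME or $USER, since
those may belong to whoever started the script.

```python
import os
import pwd
import tempfile


def per_user_egg_cache(base=None):
    """Return a cache path unique to the effective user (hypothetical helper)."""
    if base is None:
        base = tempfile.gettempdir()
    uid = os.geteuid()
    try:
        # Look the name up from the effective uid, not from the environment.
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        # System accounts without a passwd entry: fall back to the numeric uid.
        name = str(uid)
    return os.path.join(base, "python-eggs-" + name)


def set_egg_cache(base=None):
    """Create the per-user directory and point PYTHON_EGG_CACHE at it."""
    path = per_user_egg_cache(base)
    os.makedirs(path, mode=0o700, exist_ok=True)
    os.environ["PYTHON_EGG_CACHE"] = path
    return path
```

This would have to be called at the top of every script, before the
first import of an egg-based module such as MySQLdb, which is exactly
the kind of workaround I would rather not need.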

And when I tried to run a script under my own user, with a proper
home directory, nothing worked as expected either. The reason? I had
previously run a Python script as root (but without changing my $HOME
at that time). So now I have a .python-eggs in my own home directory
owned by root:root, which my own user cannot write to.

My thoughts (perhaps useful, perhaps not):

* Primarily, I think it is unfortunate that an "import foo" starts
creating files in the file system - it is not what I personally  
expect from doing an "import"!

* On our user system, with some 20,000 active users, there will be up
to 20,000 copies of a .python-eggs directory if someone installs a
program that uses a Python Egg (but does not have access to
site-packages or does not know how to detect what eggs are used and
how to install them there).

* I, personally, think it would be better if I explicitly have to  
_request_ a per-user cache directory being made, rather than needing  
to implement solutions to _prevent_ that from happening.

* If the default is to remain to create files on "import", I would
like error checking and fallbacks. If the cache directory cannot be
created in $HOME, I would like the code to create it somewhere else
(or not at all) instead of giving me an exception. As an end user I
did not request the cache directory to be made, and therefore do not
want to be given an exception caused by it not being created,
especially as I do not know what to do with such an exception. It
could perhaps be created in /tmp/python-eggs-$USERNAME, for example.
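
To make the last suggestion concrete, here is a rough sketch of the
fallback order I have in mind. It is not how setuptools works today;
the function names are invented for illustration, and it assumes a
POSIX system. It tries $HOME/.python-eggs first, then a per-user
directory under the temp directory, and returns None instead of
raising if neither is usable. The ownership check also catches the
root-owned .python-eggs case described above.

```python
import os
import pwd
import tempfile


def _usable(path):
    """Create path if needed; True only if we own it and can write to it."""
    try:
        os.makedirs(path, mode=0o700, exist_ok=True)
        if os.stat(path).st_uid != os.geteuid():
            return False  # e.g. left behind by an earlier run as root
        # Prove writability by actually creating (and discarding) a file.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


def choose_egg_cache(home=None, tmp=None):
    """Return the first usable cache directory, or None (hypothetical helper)."""
    if home is None:
        home = os.environ.get("HOME")
    candidates = []
    if home:
        candidates.append(os.path.join(home, ".python-eggs"))
    try:
        user = pwd.getpwuid(os.geteuid()).pw_name
    except KeyError:
        user = str(os.geteuid())
    tmp = tmp or tempfile.gettempdir()
    candidates.append(os.path.join(tmp, "python-eggs-" + user))
    for path in candidates:
        if _usable(path):
            return path
    return None  # caller can skip caching rather than fail the import
```

A predictable name under a shared /tmp has its own risks, which is
why the sketch refuses any directory not owned by the running user.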

Thanks for listening.
/Viktor