[Distutils] setuptools-0.4a2: Eggs, scripts, and __file__

Phillip J. Eby pje at telecommunity.com
Mon Jun 13 18:21:55 CEST 2005


At 03:12 AM 6/13/2005 -0400, Ryan Tomayko wrote:
>On Jun 13, 2005, at 1:15 AM, Phillip J. Eby wrote:
>>>The script looks like it would work properly if it was given a pseudo
>>>filename but this has me thinking about what the best way to detect
>>>development environments in scripts will look like in an eggified
>>>environment.
>>
>>That's the wrong question to ask, IMO.  Think about how to make the
>>script work exactly the same in all environments, instead.  :)
>
>I'd love to, except I can't assume setuptools and egg-based
>dependencies in all environments at the moment. In particular, Linux
>distributions like Fedora probably won't be moving to egg based
>packaging for some time. If I'm lucky I might see python RPM
>maintainers phase in package.egg-info directories on top of the
>normal site-packages layout over the next few months. What this adds
>up to--if I'm not missing something--is that I can't assume require()
>is going to work. I need to be able to fall back to assuming that
>all dependencies will be laid out for me by some other package
>management system (in this case RPM).
>
>I don't think the setuptools dependency will be hard to deal with but
>egg versions of other dependencies are probably going to be a problem
>for a little while.

Are you distributing an application, or a library?  If you're distributing 
a library, you don't need require() in library code.  If it's an 
application, you can handle your own dependencies by force-installing the 
eggs in the application script directory using EasyInstall, and then just 
require() your main package.


>  I can assume that require() will be there but I'd
>have to try/except/pass on DependencyNotFound exceptions or
>something. What I'd prefer is to keep require() out of the code
>completely and use .egg-info/depends.txt instead.

That's ideal for library code; startup scripts should just require() their 
target package, and that's almost more for development than anything else, 
since scripts installed by EasyInstall do the necessary require() work in 
pkg_resources.run_main().


>  If I'm running out
>of an egg, I want setuptools to manage requiring everything before my
>script is even called.

Yep, it'll do that.


>This should give me all of the benefits of
>eggs when I'm using them and fall back to the old-style manual
>dependency management otherwise. Does that make sense?

Um, yeah, except I don't think you really need to fall back, just because 
people have other stuff installed.  The worst that's going to happen is 
that you're going to force reinstallation of dependencies they already 
have, just to get them into eggs.  (Or make them create .egg-info 
directories to tell the system the stuff is already installed.)

Hm.  What if you created .egg directories and symlinked the dependencies 
into them during your installation process?  Or if there were some way to 
create the .egg-info directories automatically from packaging system 
databases, or from inspecting module contents?  Setuptools has some code 
that can look for the setting of constants or presence of symbols in 
modules, without importing them.  Perhaps I could extend this somehow so 
that a transitional package like yours could include additional info in the 
setup script, that checks for these dependencies and tags them somehow?

Or maybe this could be done by metadata -- you put a legacy.py file in your 
egg-info, and when processing your egg's dependencies, if pkg_resources 
can't find a package you need, it would call a function in legacy.py that 
would check for the dependency using setuptools' inspection facilities, and 
return a path and a guess at a version number.
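
A minimal sketch of what such a legacy.py hook might look like, written for
current Python 3.  The function name find_legacy_dependency() and its
signature are hypothetical, and the standard library's ast module stands in
here for setuptools' own inspection code.  It locates a module without
importing it and guesses a version from a top-level __version__ string:

```python
import ast
from importlib.machinery import PathFinder


def find_legacy_dependency(module_name, search_path=None):
    """Return (path, version_guess) for a top-level module, or None.

    Hypothetical hook: the module is located and parsed, never imported.
    search_path is a list of directories; None means sys.path.
    """
    spec = PathFinder.find_spec(module_name, search_path)
    if spec is None or spec.origin is None:
        return None  # not found, or a namespace package with no file

    version = None
    if spec.origin.endswith(".py"):
        with open(spec.origin, encoding="utf-8") as f:
            tree = ast.parse(f.read())
        # Look only at top-level assignments such as: __version__ = "1.0"
        for node in tree.body:
            if not isinstance(node, ast.Assign):
                continue
            for target in node.targets:
                if (isinstance(target, ast.Name)
                        and target.id == "__version__"
                        and isinstance(node.value, ast.Constant)
                        and isinstance(node.value.value, str)):
                    version = node.value.value
    return spec.origin, version
```

The version is only a guess: a module that computes __version__ at import
time, or doesn't define it at all, yields None for that part.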

How does that sound?


>If I move to egg dependencies
>in development and assume that either setuptools or some other
>package management utility will set up sys.path correctly, I should be
>able to get rid of manual sys.path hackery.

I assume most packaging systems install to site-packages, so if you're 
doing applications, it's basically going to boil down to eggs in the script 
directory plus whatever's in site-packages.


>When I require('MyPackage'), does setuptools look at
>MyPackage.egg-info/depends.txt and require everything else for me? I'm
>assuming it does and don't see why it wouldn't.

Yes, it does.


>  If that's the case, I might be
>able to make my scripts as simple as::
>
>     from pkg_resources import require, find_distributions
>     if list(find_distributions('MyPackage')):

Don't do this.  find_distributions() yields distributions found in a 
directory or zipfile; it doesn't take a package name, it takes a sys.path 
entry.
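
For comparison, a small sketch of calling find_distributions() the way it is
meant to be called, once per sys.path entry.  The helper name
distributions_on_path() is made up for illustration, and the import is
guarded on the assumption that pkg_resources may not be installed:

```python
import sys


def distributions_on_path(entries=None):
    """List (project_name, path_entry) pairs for distributions on a path.

    Illustrative helper, not part of pkg_resources.  Returns an empty
    list when pkg_resources itself is unavailable.
    """
    try:
        from pkg_resources import find_distributions
    except ImportError:
        return []

    found = []
    for entry in (sys.path if entries is None else entries):
        # find_distributions() takes a directory or zipfile path,
        # not a package name.
        for dist in find_distributions(entry):
            found.append((dist.project_name, entry))
    return found
```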

I'm not sure exactly what you're trying to do here.  If you just want to 
know if your script is running from a development location (and therefore 
needs to call require() to set up dependencies), couldn't you just check 
for 'MyPackage.egg-info' in the sys.path[0] (script) directory?

e.g.:

     import sys, os
     if os.path.isdir(os.path.join(sys.path[0],"MyPackage.egg-info")):
         from pkg_resources import require
         require("MyPackage")  # ensures dependencies get processed

If this is what you want, perhaps we can create a standard recipe in 
pkg_resources, like maybe 'script_package("MyPackage")', that only does the 
require if you're a development egg and not being run from run_main().
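
A rough sketch of what that recipe could look like.  script_package() does
not exist in pkg_resources; the script_dir and require parameters are
assumptions added so the check can be exercised without a real egg, and this
version makes no attempt to detect run_main():

```python
import os
import sys


def script_package(name, script_dir=None, require=None):
    """Call require(name) only if NAME.egg-info sits beside the script.

    Hypothetical helper.  Returns True if require() was called,
    False if no development .egg-info directory was found.
    """
    if script_dir is None:
        script_dir = sys.path[0]  # directory of the running script
    if not os.path.isdir(os.path.join(script_dir, name + ".egg-info")):
        return False  # not a development location; nothing to do
    if require is None:
        from pkg_resources import require
    require(name)  # ensures dependencies get processed
    return True
```

A startup script would then begin with script_package("MyPackage") and
nothing else.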


>The downside to this approach is that I would have to be sure to NOT
>distribute MyPackage.egg-info with RPMs and other packages, which
>kind of rules out any phased approach to bringing egg based packaging
>to Fedora's stock RPMs.

I don't know if .egg-info is a good idea for RPMs.  .egg-info is primarily 
intended for *development*, not deployment, because you can't easily 
override a package installed with .egg-info in site-packages.  In fact, the 
only way you can normally override it is to install an egg alongside the 
script.

My current idea for how RPMs and other packagers should install eggs is 
just to dump them in site-packages as egg files or directories, and let 
people use require() or else use EasyInstall to set the active package.

Hey, wait a second...  if you can put install/uninstall scripts in 
packages, couldn't installing or uninstalling an RPM ask EasyInstall to fix 
up the easyinstall.pth file?  This would let packagers distribute as eggs, 
but without breaking users' expectations that the package would be 
available to "just import".  If somebody explicitly wants to support 
multiversion for a package, they can run 'EasyInstall -m PackageName' to 
reset it to multi-version after installing a new version.

EasyInstall doesn't have everything that's needed to do this yet (no 
"uninstall" mode), but perhaps it's a good option to add, and then 
packagers could standardize on this approach.
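
To make the idea concrete, here is a sketch of the kind of fix-up an RPM
install or uninstall script might trigger.  This is not EasyInstall's code;
update_pth() is a hypothetical helper that relies only on the fact that a
.pth file lists one path per line:

```python
import os


def update_pth(pth_file, egg_path, install=True):
    """Add or remove one egg path in a .pth file; return the new lines.

    Hypothetical helper.  install=True makes egg_path the active entry
    (listed exactly once); install=False removes it.
    """
    lines = []
    if os.path.exists(pth_file):
        with open(pth_file) as f:
            lines = [line.rstrip("\n") for line in f]

    # Drop any existing entry first so the path never appears twice.
    lines = [line for line in lines if line != egg_path]
    if install:
        lines.append(egg_path)

    with open(pth_file, "w") as f:
        f.write("".join(line + "\n" for line in lines))
    return lines
```

An install script would call it with install=True, and the matching
uninstall script with install=False.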


>I don't know - none of these seem to be perfect solutions, but none
>of them would have taken me as much time to implement as writing this
>email either. Still, it seems worth pointing out that keeping the
>number of code level require() calls to a minimum and having some way
>of switching those few calls off and on based on environment is
>something packages that need to be included in a non-egg-based
>distribution will need to think about.

Yep.


