[Distutils] setuptools-0.4a2: Eggs, scripts, and __file__
Phillip J. Eby
pje at telecommunity.com
Tue Jun 14 01:51:22 CEST 2005
At 06:00 PM 6/13/2005 -0400, Ryan Tomayko wrote:
>On an OS X darwinports box:
>
> $ port list | grep -e '^py-' | wc -l
> 237
>
>On a Fedora 3 box (Core + Extras):
>
> $ yum list all | grep -e '^py' -e 'python' | wc -l
> 78
Impressive. :)
>I don't have any debian or gentoo boxes handy but I imagine they'd
>weigh in somewhere around the darwinports number.
>
>None of these packages are currently provided as eggs or with .egg-info
>directories when they are installed to site-packages and they
>have complex dependency relationships that are managed by the
>distribution's utility (port, yum, apt-get, emerge, etc.) This
>creates a problem for these packages because it means that they can
>not assume dependencies will always be egg managed. If they start
>adding require() calls to their scripts, they will break under these
>environments. require() is an all or nothing proposition for
>distributions and that means there will need to be a planned
>"upgrade" period or something for all packages.
>
>As a more specific example, I contribute to two packages that are
>distributed with Fedora Core: python-urlgrabber and yum. yum depends
>on python-urlgrabber and python-elementtree. Now, if I wanted to move
>yum to be egg based and use require(), I would also need to ensure
>that all yum's dependencies are egg based. When yum (and its
>dependencies) are installed from RPM, they must all be in egg format
>(or at least provide .egg-info dirs). If not, the yum script will fail.
>
>So a single package using require() can cause a snowball effect where
>many other packages would need to be upgraded to egg format as well.
>In time, this may be a good thing because it could accelerate
>adoption of eggs but for the time being it makes it really hard to
>use require().
I'm not seeing how this is any different than if you just started requiring
a newer version of a package. I mean, if 'yum' needed a newer version of
elementtree, it would force an upgrade. So why can't you just rely on a
later "port number"? ISTM that most packaging systems have something like
'-1' or 'p1' or 'nb1' (NetBSD) tagged on a revision to identify changes in
the packaging or platform-specific patches applied. Couldn't you use that
to make your 'yum' RPM depend on egg-packaged versions of its dependencies?
I understand you're saying it's a big problem, but the truth is that
relatively few existing Python packages have a lot of dependencies; the
dependency tree of the 237 darwinports is probably extremely flat. The
problems today of depending on anything are such that few people do; this
makes it relatively simple for the maintainer of a single port to just go
ahead and upgrade the dependencies, too (organizational issues
notwithstanding).
But I am obviously no expert in these matters, so I defer to you here. I'm
just saying that distribution packages that depend on more than one or two
other packages are rare in Python today, and the things that do get
depended on, tend to be frequently used, so when you do port a dependency,
it significantly reduces the number of dependencies that *need* to be
ported. Thus, although the problem appears huge in potential, I think that
the actual interconnectedness of the packages is probably quite small.
>>Or maybe this could be done by metadata -- you put a legacy.py file
>>in your egg-info, and when processing your egg's dependencies, if
>>pkg_resources can't find a package you need, it would call a
>>function in legacy.py that would check for the dependency using
>>setuptools' inspection facilities, and return a path and a guess at
>>a version number.
>>
>>How does that sound?
>
>That would solve my problem perfectly.
I'll give this some thought for the 0.5/0.6 releases, then.
Interestingly enough, this technique could possibly give someone the
opportunity to do things like look for dynamic link libraries or headers,
check operating system versions, etc.
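To make the idea concrete, here is a minimal sketch of what such a hook might look like. Nothing here is an existing setuptools or pkg_resources API: the function name find_legacy and its return convention are assumptions made up for illustration, and the version guess simply looks at common module attributes.

```python
import importlib


def find_legacy(module_name):
    """Hypothetical legacy hook: locate a non-egg module and guess its version.

    Returns a (path, version_guess) tuple, or None if the module
    cannot be imported at all.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Built-in modules have no __file__, so the path may be None.
    path = getattr(module, "__file__", None)
    # Guess a version from common conventions; fall back to "unknown".
    version = getattr(module, "__version__", None) or getattr(module, "version", None)
    if not isinstance(version, str):
        version = "unknown"
    return path, version
```

A real hook could go further than an import check, but the shape would be the same: take the name of the missing dependency and hand back a location plus a best guess at the version.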
>>I'm not sure exactly what you're trying to do here. If you just
>>want to know if your script is running from a development location
>>(and therefore needs to call require() to set up dependencies),
>>couldn't you just check for 'MyPackage.egg-info' in the sys.path[0]
>>(script) directory?
>>
>>e.g.:
>>
>>     import sys, os
>>     if os.path.isdir(os.path.join(sys.path[0], "MyPackage.egg-info")):
>>         from pkg_resources import require
>>         require("MyPackage")  # ensures dependencies get processed
>>
>>If this is what you want, perhaps we can create a standard recipe
>>in pkg_resources, like maybe 'script_package("MyPackage")', that
>>only does the require if you're a development egg and not being run
>>from run_main().
You didn't answer this, by the way.
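For reference, the recipe could be sketched roughly as follows. script_package is only the name proposed above, not a function that exists in pkg_resources, and the script_dir and require parameters are assumptions added so the sketch can be exercised without an installed egg. It covers only the development-location check, not the "not being run from run_main()" part.

```python
import os
import sys


def script_package(name, script_dir=None, require=None):
    """Call require(name) only when '<name>.egg-info' sits beside the script.

    Returns True if require() was called, False otherwise.
    """
    if script_dir is None:
        script_dir = sys.path[0]  # directory of the running script
    if not os.path.isdir(os.path.join(script_dir, name + ".egg-info")):
        return False  # not a development location; nothing to do
    if require is None:
        # Imported lazily, so non-development runs never need pkg_resources.
        from pkg_resources import require
    require(name)  # ensures dependencies get processed
    return True
```

A script would then start with a single call such as script_package("MyPackage") and behave the same whether or not it is being run from a development checkout.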
>I'd be happy to advocate to / work with packagers once we get a basic
>set of best practices together. It seems like there are a lot of
>options here - we just need to iron out the details.
Yeah; I think that basically the best approach for packaging systems will
be to run EasyInstall during install and uninstall to modify the
easyinstall.pth file. I also think that if in Python 2.5 we can change the
bdist_* commands to create packages this way, then that should help, too.
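As a rough illustration of what the install and uninstall steps boil down to, the sketch below adds or removes one path entry in a .pth file (each plain line in such a file is a directory that Python's site module adds to sys.path). This is not EasyInstall's actual code; the function name update_pth and its arguments are assumptions for illustration only.

```python
import os


def update_pth(pth_path, entry, add=True):
    """Add or remove one path entry in a .pth file; return the resulting entries."""
    entries = []
    if os.path.exists(pth_path):
        with open(pth_path) as f:
            entries = [line.strip() for line in f if line.strip()]
    # Drop any existing copy first so an entry is never listed twice.
    entries = [e for e in entries if e != entry]
    if add:
        entries.append(entry)
    with open(pth_path, "w") as f:
        f.write("".join(e + "\n" for e in entries))
    return entries
```

A packaging system's post-install hook would call this with add=True, and its pre-uninstall hook with add=False.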