[Python-Dev] setuptools: past, present, future
Phillip J. Eby
pje at telecommunity.com
Sat Apr 22 18:39:51 CEST 2006
At 05:41 PM 4/22/2006 +1000, Nick Coghlan wrote:
>Phillip J. Eby wrote:
>>At 12:22 AM 4/22/2006 -0400, Terry Reedy wrote:
>>>Why can't you remove the heuristic and screen-scrape info-search code
>>>from the easy_install client and run one spider that would check
>>>new/revised PyPI entries, search for missing info, insert it into PyPI when
>>>found (and mark the entry eggified), or email the package author or a human
>>>search volunteer if it does not find enough?
>>I actually considered that at one point. After all, I certainly have the
>>technology.
>>However, I didn't consider it for more than 10 seconds or so. Package
>>authors have no reason to listen to some random guy with a bot -- but
>>they do have reasons to listen to their users, both actual and potential.
>
>I'm not sure that's what Terry meant - I took it to mean *make the spider
>part of PyPI itself*.
Which would also be accomplished by using Grig's Cheesecake tool, since it
uses easy_install to fetch the source.
>Then all the heuristics and screen-scraping would be server-side - all
>easy_install would have to do is look at the meta-data provided by the
>PyPI spider.
Which is certainly attractive from the POV of being able to make changes
quickly.
However, I forgot to mention another issue, because I was speaking from the
point of view of the time when I designed the thing, not the present
day. After it was implemented, it turned out that being able to point
easy_install to web pages with a specific collection of packages (e.g. ones
built for a specific OS version, or tested for a particular purpose) is
*very* useful in practice. And the people who are doing that are just
going to do whatever it takes to make their listing(s) work with
easy_install, because that's the whole point for them. So the heuristics
don't have to grow without limit there.
What it basically amounts to, then, is that easy_install heuristics
currently only have to chase people who aren't trying to easy_install their
packages. For example, I discovered the other day that easy_install can
get confused by bdist_dumb distributions. So few people distribute
bdist_dumb packages that I had never run into the problem before. I had to
update the heuristics to tell from the filename whether a file is likely
to be a bdist_dumb distribution.
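To make the idea of a filename heuristic concrete, here is a minimal sketch, not the check easy_install actually uses. It assumes that a bdist_dumb archive is named like "Foo-1.0.linux-i686.tar.gz", i.e. an archive whose stem ends in a platform tag, while a source distribution is named like "Foo-1.0.tar.gz". The function name, the extension list and the platform pattern are all hypothetical and deliberately incomplete.

```python
import re

# Archive extensions considered; longer ones first so ".tar.gz" wins over ".tar".
ARCHIVE_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tgz", ".tar", ".zip")

# Assumed platform tags that may follow the version in a bdist_dumb filename.
PLATFORM_SUFFIX = re.compile(
    r"\.(win32|win-amd64|linux-[A-Za-z0-9_]+|macosx-[0-9.]+-[A-Za-z0-9_]+)$"
)


def looks_like_bdist_dumb(filename):
    """Guess from the filename alone whether an archive is a bdist_dumb."""
    lowered = filename.lower()
    for ext in ARCHIVE_EXTENSIONS:
        if lowered.endswith(ext):
            stem = filename[:-len(ext)]
            # A trailing platform tag suggests a built (binary) archive.
            return PLATFORM_SUFFIX.search(stem) is not None
    return False


for name in ("Foo-1.0.tar.gz", "Foo-1.0.linux-i686.tar.gz", "Foo-1.0.win32.zip"):
    print(name, looks_like_bdist_dumb(name))
```

Like any such heuristic, this can misfire on a project whose version string happens to end in something that resembles a platform tag, so it only says "likely", never "certainly".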
However, if PyPI is doing Cheesecake ratings, there will only be a finite
number of such things to deal with, because when people make changes that
break their ratings, they'll just fix the problem themselves, as it'll
generally be faster than lobbying for new heuristics in easy_install. As
the community becomes better educated about making their package links easy
to find, the amount of maintenance work needed for easy_install should drop
off. Right now, the main reason to add heuristics is to increase
compatibility with whatever practices are already out there, in order to
leverage the greatest number of existing packages to secure the greatest
number of users.