[Distutils] Mystery solved

Jim Fulton jim at zope.com
Tue Jul 11 21:56:48 CEST 2006

On Jul 11, 2006, at 2:07 PM, Phillip J. Eby wrote:

> At 11:50 AM 7/11/2006 -0400, Jim Fulton wrote:
>> OK that's an interesting point wrt possible misspellings. If you can
>> find the package via the find links, but not via the index, that
>> seems to me to be a pretty good indication that this is not a
>> misspelling.  This is the case I'm worried about.  If the package
>> can't be found anywhere, then I agree that a warning is warranted.
> The interesting question there is, should the fallback scan still  
> take place in the absence of the warning?  If it *does* take place,  
> then the reason for the scan (and delay) is unexplained.  If it  
> does *not* take place, then there is an undesirable change in  
> semantics.
> Currently, if you have a package called "Bob's Incredible Package",
> this will be treated by easy_install as being spelled
> "Bob-s-Incredible-Package", and it will require a top-level index scan to
> find the right URL.  It is also possible to have --find-links pages
> containing obsolete versions, while PyPI contains the latest
> version, so removing the scan doesn't seem to be a reasonable option.
> So, I will simply change the message to an "info" message stating  
> that the index page couldn't be found (rather than a warning  
> suggesting misspelling), *if* easy_install has previously seen at  
> least one valid distribution file or link for the applicable  
> project name.
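
For illustration, the respelling of "Bob's Incredible Package" mentioned
above can be reproduced with a single substitution. This is only a sketch:
the function name safe_project_name is hypothetical, and the rule (each run
of characters other than letters, digits, and "." becomes one "-") is an
assumption inferred from the example in the message, not a copy of
easy_install's code.

```python
import re

def safe_project_name(name):
    """Replace each run of characters other than letters, digits,
    and '.' with a single '-' (assumed rule, see the note above)."""
    return re.sub(r"[^A-Za-z0-9.]+", "-", name)

# The apostrophe and each space are separate runs, so each becomes one '-'.
print(safe_project_name("Bob's Incredible Package"))  # Bob-s-Incredible-Package
```

Because the respelled name differs from the name registered on the index,
a direct lookup can miss, which is why the top-level index scan is still
needed as a fallback.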


>> The specific case, which I'll repeat from above, as clearly as I can,
>> is this:
>> - A user chooses not to store their software in an index.
>> - The user places distributions on a web server somewhere.  This is
>> just a directory, it is not a valid index.
>> - The user points at their server using find-links
>> - The user has an installation and they want to check for newer
>> versions.
>> - The distributions that they are looking for newer versions of can
>> be found on the server that they name via find-links.
>> In this case, they will get a warning that the distribution they are
>> looking for couldn't be found on the index.
> Okay, this scenario is fixed by changing to an info message as  
> described above.

Yup. Cool.
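
A rough sketch of the agreed behaviour, as I read it: the fallback scan
still happens either way, and only the severity and wording of the message
change. Everything named here (missing_index_page_message, seen_projects,
the message texts) is hypothetical and is not taken from easy_install.

```python
import logging

def missing_index_page_message(project, seen_projects):
    """Pick the log level and text for a project with no index page.

    seen_projects is a hypothetical set of project names for which at
    least one valid distribution file or link has already been seen,
    for example via --find-links.
    """
    if project in seen_projects:
        # Distributions were found elsewhere, so a misspelling is unlikely.
        return logging.INFO, "Couldn't find index page for %r" % project
    # Nothing found anywhere: a misspelling is a plausible explanation.
    return logging.WARNING, (
        "Couldn't find index page for %r (maybe misspelled?)" % project
    )

level, text = missing_index_page_message("myproject", {"myproject"})
logging.getLogger("sketch").log(level, text)
```

In the find-links-only scenario above, the project is already in
seen_projects by the time the index lookup fails, so the user sees an
info message rather than a warning.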

>>>   If you did that, however, it brings in the question of which of
>>> the --find-links URLs should be checked for a /projectname/
>>> subdirectory.  All of them?  Just the first one that finds a
>>> result?  None of them, if some other criterion is met?
>> I would stop when a result is found.
> Even so, this means O(N x M) web hits, where N is the number of  
> packages and M is the number of --find-links (including dependency  
> links supplied by eggs installed so far).  I don't think it's
> reasonable to hit so many non-existent URLs on non-index servers,
> and it is impolite to the servers' operators.  (For example, if they
> receive a daily report of all 404 errors from their web servers, as  
> I do.  This is pretty common on Red Hat boxes using logwatch, for  
> example.)
> It's particularly unfair since using e.g.
> http://peak.telecommunity.com/snapshots/ as a --find-links while
> installing, say TurboGears, would cause a whole host of "index"
> hits to subdirectories of that URL, even though none of them can or
> will be found.
> The fallout from this approach is far worse than any "screen  
> scraping" issues we've had.
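
To make the O(N x M) figure concrete, here is a small sketch that lists
the /projectname/ URLs that would be requested in the worst case, where no
probe succeeds and so "stop when a result is found" never stops early. The
function index_probe_urls, the two extra package names, and the second
find-links URL are made up for illustration.

```python
def index_probe_urls(packages, find_links):
    """List every <find-links URL>/<project name>/ that would be
    requested when none of the probes finds anything."""
    urls = []
    for name in packages:          # N packages
        for base in find_links:    # M find-links URLs
            urls.append(base.rstrip("/") + "/" + name + "/")
    return urls

packages = ["TurboGears", "ExampleDepA", "ExampleDepB"]
find_links = [
    "http://peak.telecommunity.com/snapshots/",
    "http://downloads.example.invalid/eggs/",
]
probes = index_probe_urls(packages, find_links)
print(len(probes))  # N * M = 3 * 2 = 6
```

Each of those requests would show up as a 404 in the server's logs if the
server is a plain directory rather than an index.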

Isn't this the approach that's followed now?  Aren't all of the
find-links searched as well as the index?  I suppose you're referring to
the search for /projectname, which potentially doubles the number of  

>> What is the use case for spreading distributions over multiple
>> servers?  Do people really want to do that? I can see providing
>> multiple places to look, because different distributions might be on
>> different servers, but I don't see why distributions for a single
>> project should be spread over multiple servers.
> Platform-specific distributions may be provided by contributors to  
> a project, rather than by the project's author; see, for example,  
> Bob Ippolito's pages for distributing Mac OS X builds of popular  
> Python packages.  For this reason, you may have certain pages that  
> you always want included in your --find-links, to be checked in  
> addition to the normal indexes.



Jim Fulton			mailto:jim at zope.com		Python Powered!
CTO 				(540) 361-1714			http://www.python.org
Zope Corporation	http://www.zope.com		http://www.zope.org
