[Distutils] API for finding plugins

Wed Feb 8 18:19:41 CET 2006

At 01:10 AM 2/8/2006 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Probably.  ;)  A fix, however, would be to change your entry point names 
>>to include a description.  Currently, entry point names can contain any 
>>characters you like besides '='.  (And leading/trailing spaces are skipped.)
>>This means that you can define entry points like this, as long as you do 
>>the name parsing in your own tools:
>>    commit (ci,checkin) - Commit the current version = some.module:commit
>
>Somehow this bothers me.  In practice I guess this gives me all the data I 
>want... and yet I don't really feel confident it gives me that data. And 
>it also feels like the entry point becomes unstable at a certain point.

Um, okay.  Those sound like personal issues that I can't help you 
with.  ;)  Entry point names were designed with this kind of flexible 
processing in mind, however.  You can embed as much data as you want in the 
name, as long as there are no line breaks or '=', and the name doesn't 
start with a '#'.
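
For illustration, here is a minimal sketch of the kind of name parsing a tool
could do on its own, assuming the "name (aliases) - description" layout from
the example above.  The function name and the regex are hypothetical;
setuptools does not interpret names this way, it just hands back the string.

```python
import re

# Assumed convention (not part of setuptools): "name (alias,alias) - description"
_NAME_RE = re.compile(
    r"""^\s*(?P<name>[^\s(]+)             # command name
        \s*(?:\((?P<aliases>[^)]*)\))?    # optional "(alias,alias)"
        \s*(?:-\s*(?P<description>.*))?$  # optional "- description"
    """,
    re.VERBOSE,
)


def parse_entry_point_name(raw):
    """Split a descriptive entry point name into (name, aliases, description)."""
    match = _NAME_RE.match(raw)
    if match is None:
        raise ValueError("unparseable entry point name: %r" % (raw,))
    aliases = [
        alias.strip()
        for alias in (match.group("aliases") or "").split(",")
        if alias.strip()
    ]
    description = (match.group("description") or "").strip()
    return match.group("name"), aliases, description
```

A plain name with no aliases or description still parses, so existing entry
points would not need to change under this scheme.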

>I'd very much like to get the actual location of the resource as well. 
>This is what should show up in debugging output.

That should be done in the resource object's __repr__, then.  But I can see 
having an attribute that's effectively some str() of the object's 
location.  (By which I mean it's human readable, but not machine-usable, 
since the resource provider could be a file, a database, or who knows what.)
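
Roughly what I have in mind, as a sketch only: the class and attribute names
here are made up for illustration, since no resource class has been defined
yet.

```python
class Resource(object):
    """Hypothetical resource object, for illustration only."""

    def __init__(self, name, location):
        self.name = name
        # Human-readable only: could describe a file path, a database row,
        # or anything else the provider uses.  Not meant to be parsed.
        self.location = str(location)

    def __repr__(self):
        return "<Resource %r from %s>" % (self.name, self.location)
```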

>Also, logging of some sort should go in early, I think, as it should be 
>expected that resource resolution will have to be debugged, and the only 
>way I imagine it being debuggable is through log messages.

You lost me on that.  I'm having trouble seeing what you'll be able to (or 
need to) debug that way.  The most likely problems are that you misspell 
something (in which case there'll be no matches at all) or you're missing a 
provider (in which case studying the list of the providers will give you 
the answer).  In either case, inspecting the current state of the system 
seems sufficient.

OTOH, as long as the interfaces are well-defined, nothing stops you from 
creating logging "middleware" for debugging purposes, I suppose.  I just 
don't want to embed that stuff in core code, if for no other reason than 
that the logging module is a PITA.
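
As a sketch of what such a wrapper might look like, assuming (and this is
purely an assumption, since the interface isn't designed yet) that a provider
is any object with a find(**criteria) method:

```python
import logging


class LoggingProvider(object):
    """Wrap a provider and log each search it performs.

    The find(**criteria) interface is hypothetical; adapt it to whatever
    the real provider interface turns out to be.
    """

    def __init__(self, provider, logger=None):
        self.provider = provider
        if logger is None:
            logger = logging.getLogger("resources")
        self.logger = logger

    def find(self, **criteria):
        # Materialize the results so they can be counted for the log line.
        results = list(self.provider.find(**criteria))
        self.logger.debug(
            "%r: find(%r) -> %d result(s)", self.provider, criteria, len(results)
        )
        return results
```

The core code never imports logging this way; only the people who want the
log messages pay for them.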

>OK, that's probably all I mean.  I just don't want someone to get 
>"/images/", and then look inside there for "plus.gif", and get a not found 
>error because it looks inside the directory named /images/ that was found 
>which only contains a subset of files.

Oh, no...  I'm assuming that the attribute namespaces are entirely 
flat.  If you want directories or some other kind of hierarchy you would 
need to simulate them using 'directory' attributes or something like 
that.  I'm assuming also that an individual resource can have more than one 
value for an attribute, so that a single resource could be e.g. registered 
for both 'en' and 'en-US' locales.
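
To make that concrete, here is a sketch under the assumption that a resource's
attributes are stored as a flat mapping from attribute name to a set of
values.  The 'directory' attribute and the matches() helper are illustrative
names, not a proposed API.

```python
def matches(attributes, **criteria):
    """Return True if every criterion's value is among the resource's values.

    attributes: dict mapping attribute name -> set of values (assumed layout).
    """
    return all(
        value in attributes.get(key, ())
        for key, value in criteria.items()
    )


# One resource, registered for two locales; the "directory" is just
# another flat attribute rather than a real container.
plus_gif = {
    "resource": {"plus.gif"},
    "directory": {"images"},
    "locale": {"en", "en-US"},
}
```

A search for directory='images' plus resource='plus.gif' then hits every
provider that has such a resource, so nobody ends up looking inside one
provider's partial "/images/".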

>>Efficiently handling search precedence across multiple resource providers 
>>is also an interesting problem.  You really want the result precedence to 
>>be based on stuff like the locale and layers, *not* on which provider 
>>found the data.  This means that searches like the example I gave have to 
>>be broken down into a variety of single-value searches done in 
>>sequence, each one executed in "parallel" against all backends.  Either 
>>that, or there has to be a kind of sort-merge done against results 
>>yielded by the backends to ensure that the "best" results are yielded first.
>>Probably the best thing to do is going to be to require searches to be 
>>prioritized on input, e.g.:
>>     find_resource(
>>        ('resource', ['my_page']),
>>        ('for_project', ['MyProject']),
>>        ('layer', ['some_layer','other_layer']),
>>        ('locale', ['en','de']),
>>     ...
>>The above is saying, "first look for resources named my_page that are for 
>>MyProject, and of those you find, give precedence to 'some_layer' over 
>>'other_layer' ones.  And within those, give precedence to locales of 'en' 
>>then 'de'."
>>This approach has the benefit of allowing entire backends to be excluded 
>>early from the search, since it doesn't matter what layers or locales an 
>>egg has resources for if it doesn't have 'my_page' for 'MyProject'.
>
>You wouldn't have to order the criteria to deal with that, since it seems 
>like a fairly easy query optimization

"Easy query optimization" is usually an oxymoron.  :)

>to see that since there was only one option given for each of resource 
>and for_project, it must be satisfied regardless of precedence.

True.  On the other hand, for the use cases where this is true, you'll 
usually statically know that at the call point.  And, for the ones that 
have multiple values, following the order given is critical.  But that's 
probably a matter for tuning of an *actual* implementation, rather than at 
the design level.  :)
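
For what it's worth, here is a naive sketch of the "follow the order given"
semantics.  It is neither the sequential search nor a streaming sort-merge; it
simply ranks every candidate and sorts, which is the easiest way to show the
intended precedence.  The backend layout (lists of attribute dicts) and the
extra 'backends' argument are assumptions made for the example.

```python
def find_resource(backends, *criteria):
    """Return matching resources, best first.

    backends: iterable of iterables of resources; each resource is a dict
              mapping attribute name -> set of values (an assumed layout).
    criteria: (attribute, [values in preference order]) pairs, most
              significant criterion first.
    """
    ranked = []
    for backend in backends:
        for resource in backend:
            rank = []
            for attribute, wanted in criteria:
                have = resource.get(attribute, ())
                positions = [i for i, v in enumerate(wanted) if v in have]
                if not positions:
                    break  # fails this criterion, so drop the resource
                rank.append(positions[0])  # index of the best value it has
            else:
                ranked.append((tuple(rank), resource))
    # Precedence comes from the rank tuple alone, never from which
    # backend supplied the resource.  The sort is stable.
    ranked.sort(key=lambda item: item[0])
    return [resource for _, resource in ranked]
```

Because the rank tuple is compared left to right, earlier criteria dominate
later ones, which is exactly why the input order matters when a criterion
lists more than one value.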

>Unless there was some kind of wildcard that a resource could 
>provide.  Like providing "my_page" for any project.  I don't expect there 
>to be such a wildcard...?

I don't either.  Multiple values for an attribute, yes.  Wildcard 
attributes, not really, at least not as part of the system 
itself.  Wildcards can be simulated by either omitting a value from a 
search, or having an explicit wildcard value that's always searched last in 
a list.
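
A sketch of the second option, assuming the same flat attribute layout as
before (attribute name mapped to a set of values).  The "*" marker and the
pick() helper are invented for the example; the system itself would attach no
special meaning to the value.

```python
ANY = "*"  # hypothetical explicit wildcard value, by convention only


def pick(resources, attribute, wanted):
    """Return resources having one of `wanted` for `attribute`, best first."""
    found = []
    for value in wanted:  # preference order, wildcard listed last
        for resource in resources:
            if value in resource.get(attribute, ()) and resource not in found:
                found.append(resource)
    return found


generic = {"resource": {"my_page"}, "for_project": {ANY}}
specific = {"resource": {"my_page"}, "for_project": {"MyProject"}}

# The project-specific page wins; the generic one is the fallback.
ordered = pick([generic, specific], "for_project", ["MyProject", ANY])
```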