[Distutils] API for finding plugins

Ian Bicking ianb at colorstudy.com
Wed Feb 8 08:10:54 CET 2006

Phillip J. Eby wrote:
> At 09:40 PM 2/7/2006 -0600, Ian Bicking wrote:
>> Phillip J. Eby wrote:
>>> I'm assuming here that the problem is needing to import each command 
>>> to get its description and display it?
>> Oh, yes, that too ;)  That probably is the bigger problem, and 
>> inevitable.  That doesn't happen except with help.  So maybe I am 
>> worried about nothing.
> Probably.  ;)  A fix, however, would be to change your entry point names 
> to include a description.  Currently, entry point names can contain any 
> characters you like besides  '='.  (And leading/trailing spaces are 
> skipped.)
> This means that you can define entry points like this, as long as you do 
> the name parsing in your own tools:
>    commit (ci,checkin) - Commit the current version = some.module:commit
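
A minimal sketch of the "name parsing in your own tools" step, assuming the layout shown in the example above (name, optional parenthesized aliases, optional " - description"). The function name parse_command_name and the exact grammar are assumptions made for illustration; they are not part of setuptools or pkg_resources.

```python
import re

# Assumed layout of an entry point name: "name (alias,alias) - description".
# Only the name is required; aliases and description are optional.
_NAME_RE = re.compile(
    r"^\s*(?P<name>[^\s(]+)"                # command name
    r"(?:\s*\((?P<aliases>[^)]*)\))?"       # optional "(ci,checkin)"
    r"(?:\s+-\s+(?P<description>.*))?\s*$"  # optional "- description"
)


def parse_command_name(entry_point_name):
    """Split an entry point name into (name, aliases, description)."""
    match = _NAME_RE.match(entry_point_name)
    if match is None:
        raise ValueError("unparseable entry point name: %r" % entry_point_name)
    raw_aliases = match.group("aliases") or ""
    aliases = [a.strip() for a in raw_aliases.split(",") if a.strip()]
    description = (match.group("description") or "").strip()
    return match.group("name"), aliases, description
```

With this, a help command could list descriptions from the entry point names alone, without importing each command's module.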

Somehow this bothers me.  In practice I guess this gives me all the data 
I want... and yet I don't really feel confident it gives me that data. 
And it also feels like the entry point becomes unstable at a certain point.

Of course, if the help was a resource instead of a Python object...

>> Are resources typed in any way?  Similar to entry point groups...?
> I think resources should only provide access to string/stream/filename 
> (ala pkg_resources resources) and their metadata attributes (like 
> locale, layer, etc.)  If you want to have more elaborate typing, you can 
> simply use another attribute to define it.  For example, a content_type 
> attribute or an attribute that says what entry point to use to adapt the 
> resource to some interface.

I'd very much like to get the actual location of the resource as well. 
This is what should show up in debugging output.  Also, logging of some 
sort should go in early, I think, as it should be expected that resource 
resolution will have to be debugged, and the only way I imagine it being 
debuggable is through log messages.
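
A rough sketch of what I mean by logging resolution, using the standard logging module. The resolve function, the "resources" logger name, and the exists callback are all hypothetical; the point is only that every attempted location shows up in the log, along with the one that finally matched.

```python
import logging

log = logging.getLogger("resources")  # hypothetical logger name


def resolve(name, search_paths, exists):
    """Return the first location that has `name`, logging each attempt.

    `exists` is a caller-supplied predicate (for example os.path.exists),
    so the sketch does not assume where resources actually live.
    """
    for base in search_paths:
        location = base.rstrip("/") + "/" + name
        if exists(location):
            log.debug("resolved %r -> %s", name, location)
            return location
        log.debug("no %r at %s", name, location)
    log.warning("resource %r not found in %d locations", name, len(search_paths))
    return None
```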

>>> Of course, this would in most cases be wrapped by some higher-level 
>>> API that eliminates most of the parameters from needing to be 
>>> specified (e.g. by a framework that knows what locales and skin 
>>> layers are in effect and what project the requesting code is calling 
>>> from).  For performance, you could extract subset resource sets and 
>>> use them instead of querying a top-level resource set.
>> I often find myself wanting to just override one little bit.  Subset 
>> resources would potentially break that, unless they are a subset that 
>> is resolved all at once instead of a subset that has to provide all 
>> the necessary resources.  At least if you are describing what I think 
>> you are describing.
> I'm not sure I follow you.  If it's something you "override", then you'd 
> have to leave it out of your subset criteria.  What I'm describing is 
> the ability to have a subset snapshot for performance reasons, not 
> simply a restricted view over a larger set.  (Although that also sounds 
> like a useful thing to have.)

OK, that's probably all I mean.  I just don't want someone to get 
"/images/", and then look inside there for "plus.gif", and get a not 
found error because it looks inside the directory named /images/ that 
was found which only contains a subset of files.
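
To illustrate the behavior I'm hoping for, here is a small sketch using collections.ChainMap from the standard library as a stand-in for a layered "/images/" directory. The paths and layer contents are made up; a real resource set would not necessarily be built this way.

```python
from collections import ChainMap

# Hypothetical layers: the override layer replaces just one file.
override = {"logo.gif": "/custom/images/logo.gif"}
base = {
    "logo.gif": "/base/images/logo.gif",
    "plus.gif": "/base/images/plus.gif",
}

# Lookups are resolved per file, highest-precedence layer first, so
# overriding one file does not hide the rest of the directory.
images = ChainMap(override, base)
```

Here images["logo.gif"] comes from the override layer, while images["plus.gif"] still falls through to the base layer instead of raising a not found error.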

> Mainly, my concerns about this approach are that, without tuning or 
> hinting for a particular access profile, it's going to be tough to have 
> a fast data structure that's also memory-efficient.  Creating indexes on 
> all attributes means a space consumption of roughly one dictionary or 
> set per unique attribute value.  That is, every relatively-unique key 
> consumes a dictionary of its own, consuming hundreds of bytes.  Every 
> resource will have at least 1 relatively-unique attribute value, namely 
> its ID.
> On the plus side, even as the total number of resources grows due to 
> variants, the raw overhead for the dictionaries should remain the same, 
> since each new language or layer will only add one new unique key (the 
> language or layer).  So, it's probably not as bad to just index 
> everything as I'm worrying it would be.
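
A minimal sketch of the "index everything" idea described above, assuming resources are plain dictionaries of attributes keyed by ID. The build_index name and the data layout are assumptions for illustration only. Each unique (attribute, value) pair costs one set, which is the overhead being estimated.

```python
from collections import defaultdict


def build_index(resources):
    """Map each (attribute, value) pair to the set of resource IDs having it."""
    index = defaultdict(set)
    for res_id, attrs in resources.items():
        for attr, value in attrs.items():
            index[attr, value].add(res_id)
    return index


# Hypothetical data: adding a new locale adds one key, not one per resource.
resources = {
    1: {"resource": "my_page", "locale": "en", "layer": "some_layer"},
    2: {"resource": "my_page", "locale": "de", "layer": "some_layer"},
    3: {"resource": "plus.gif", "locale": "en", "layer": "other_layer"},
}
index = build_index(resources)
```

A multi-attribute query is then a set intersection, for example index["resource", "my_page"] & index["locale", "de"].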

OK, I will trust you on efficiency ;)

> Efficiently handling search precedence across multiple resource 
> providers is also an interesting problem.  You really want the result 
> precedence to be based on stuff like the locale and layers, *not* on 
> which provider found the data.  This means that searches like the 
> example I gave have to be broken down into a variety of 
> single-value searches done in sequence, each one executed in "parallel" 
> against all backends.  Either that, or there has to be a kind of 
> sort-merge done against results yielded by the backends to ensure that 
> the "best" results are yielded first.
> Probably the best thing to do is going to be to require searches to be 
> prioritized on input, e.g.:
>     find_resource(
>        ('resource', ['my_page']),
>        ('for_project', ['MyProject']),
>        ('layer', ['some_layer','other_layer']),
>        ('locale', ['en','de']),
>     ...
> The above is saying, "first look for resources named my_page that are 
> for MyProject, and of those you find, give precedence to 'some_layer' 
> over 'other_layer' ones.  And within those, give precedence to locales 
> of 'en' then 'de'."
> This approach has the benefit of allowing entire backends to be excluded 
> early from the search, since it doesn't matter what layers or locales an 
> egg has resources for if it doesn't have 'my_page' for 'MyProject'.
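
A minimal in-memory sketch of the prioritized search described above. It assumes resources are plain dictionaries and ignores the multi-backend problem entirely; the find_resources name and signature are illustrative, not a proposed API. Criteria are ordered pairs of (attribute, acceptable values), with earlier criteria and earlier values taking precedence.

```python
def find_resources(resources, *criteria):
    """Return matching resources, best match first.

    Each criterion is (attribute, [values...]).  A resource must match one
    value for every criterion; ties are broken by the position of the
    matched value, comparing criteria in the order given.
    """
    def rank(res):
        return tuple(values.index(res[attr]) for attr, values in criteria)

    matches = [
        res for res in resources
        if all(res.get(attr) in values for attr, values in criteria)
    ]
    return sorted(matches, key=rank)
```

Because the sort key is built from the caller's ordering, precedence depends only on layer and locale, not on where a resource happened to be found.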

You wouldn't have to order the criteria to deal with that, since it 
seems like a fairly easy query optimization to see that since there was 
only one option given for each of resource and for_project, it must be 
satisfied regardless of precedence.  Unless there was some kind of 
wildcard that a resource could provide.  Like providing "my_page" for 
any project.  I don't expect there to be such a wildcard...?
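
Roughly, the optimization I have in mind looks like the sketch below: any criterion with exactly one acceptable value is a hard requirement no matter where it appears, so it can be pulled out and used to exclude backends early. The split_criteria name is made up, and this assumes there are no wildcard matches.

```python
def split_criteria(criteria):
    """Separate single-valued criteria (pure filters) from ranked ones.

    Returns (required, ranked): `required` maps attribute -> the only
    acceptable value; `ranked` keeps the multi-valued criteria in the
    caller's order, since only those affect precedence.
    """
    required = {attr: values[0] for attr, values in criteria if len(values) == 1}
    ranked = [(attr, values) for attr, values in criteria if len(values) > 1]
    return required, ranked
```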

Though I can almost imagine more complex search rules.  But I'm having a 
hard time coming up with one.  I thought I had one, then it slipped away...

Ian Bicking  |  ianb at colorstudy.com  |  http://blog.ianbicking.org
