[Distutils] API for finding plugins
Phillip J. Eby
pje at telecommunity.com
Wed Feb 8 19:07:47 CET 2006
At 10:40 AM 2/8/2006 -0600, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>Probably the best thing to do is going to be to require searches to be
>>prioritized on input, e.g.:
>>     find_resource(
>>         ('resource', ['my_page']),
>>         ('for_project', ['MyProject']),
>>         ('layer', ['some_layer','other_layer']),
>>         ('locale', ['en','de']),
>>         ...
>
>Thoughts about the return value:
>
>I think there should be a dictionary returned, that contains various metadata.

Actually, there should be a find_resources() that yields resources
according to the precedence given, and find_resource() will simply return
the first resource.
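
A minimal sketch of that split, for illustration only. It assumes an in-memory `index` (a list of dicts mapping attribute names to tuples of strings), which is not part of the proposal; the thread only gives the prioritized (attribute, [values]) arguments. Earlier criteria and earlier values are treated as more significant.

```python
def find_resources(index, *criteria):
    """Yield matching resources, best match first.

    `index` is a hypothetical list of dicts (attribute -> tuple of strings).
    `criteria` are (attribute, [values]) pairs in priority order.
    """
    ranked = []
    for position, resource in enumerate(index):
        key = []
        for attr, wanted in criteria:
            offered = resource.get(attr, ())
            ranks = [i for i, value in enumerate(wanted) if value in offered]
            if not ranks:
                break  # resource offers none of the wanted values
            key.append(min(ranks))  # best-preferred value it offers
        else:
            ranked.append((key, position, resource))
    # Sort by per-criterion rank, then by original position for stability.
    for key, position, resource in sorted(ranked, key=lambda r: r[:2]):
        yield resource


def find_resource(index, *criteria):
    """Return the highest-precedence match, or None if nothing matches."""
    return next(find_resources(index, *criteria), None)
```

Because find_resources() is a generator, a caller that only wants the first hit does not pay for building the full result list, although this sketch still ranks everything up front.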

> In this case it should contain at least {'resource': 'my_page',
> 'for_project': 'MyProject', 'layer': 'other_layer', 'locale': 'en'}, to
> represent exactly what was found.

There would actually be a Resource object, with either a mapping or
attribute interface to these things, as well as methods. The attributes
are going to be tuples of strings, however, since they can be multi-valued.
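
One way such an object could look, as a sketch under assumptions: the class below and its method names are hypothetical, and it offers both the mapping and the attribute interface even though the message leaves that choice open. Single strings are normalized to one-element tuples so every attribute is multi-valued.

```python
class Resource:
    """Hypothetical resource record; every attribute is a tuple of strings."""

    def __init__(self, **attrs):
        self._attrs = {
            name: (value,) if isinstance(value, str) else tuple(value)
            for name, value in attrs.items()
        }

    def __getitem__(self, name):
        # Mapping-style access: resource['locale']
        return self._attrs[name]

    def __getattr__(self, name):
        # Attribute-style access: resource.locale
        if name.startswith('_'):
            raise AttributeError(name)  # avoid recursion on private names
        try:
            return self._attrs[name]
        except KeyError:
            raise AttributeError(name) from None

    def get(self, name, default=()):
        return self._attrs.get(name, default)

    def keys(self):
        return self._attrs.keys()
```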

> Other metadata can be useful, like the resource_location (the actual
> filename or other user-readable name). Content-type is useful in many
> contexts -- for instance, you might want some kind of image, but then you
> need to know exactly what kind you got back. If it gets a stream,
> knowing Content-length is useful as well. Also, either encoding should
> be given (for text resources), or unicode returns should be allowed (I
> prefer unicode return values). I think the resource container usually
> knows the encoding, not the entity requesting the resource.

Content type would probably need to be mime_major+mime_minor attributes, so
you could request just 'image', without implementing more complex matches
for searching. Content length I don't see as a searchable attribute, but I
can see having a method of some sort to query that, and the same is true
for encoding.
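
To illustrate why splitting the content type helps, here is a small sketch. The helper names are made up for this example; only the mime_major/mime_minor attribute names come from the message. With two plain attributes, asking for "any image" is an ordinary equality match on mime_major, with no wildcard logic.

```python
def mime_attributes(content_type):
    """Split 'major/minor[; params]' into two searchable attributes."""
    base = content_type.split(';', 1)[0].strip().lower()
    major, _, minor = base.partition('/')
    return {'mime_major': (major,), 'mime_minor': (minor,)}


def offers(resource, attr, wanted):
    """True if the resource offers any of the wanted values for attr."""
    return any(value in resource.get(attr, ()) for value in wanted)


png = mime_attributes('image/png')
html = mime_attributes('text/html; charset=utf-8')

# Request "just 'image'" without caring about the minor type.
images = [r for r in (png, html) if offers(r, 'mime_major', ['image'])]
```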

>That dictionary could also be a container for callables that produce
>things like filenames and streams. Or they could be returned in a
>different way. Or it could be an actual object, with methods for those
>things. Somehow I have become fond of dictionaries.

Heh. I think objects are a reasonable way to go here; there isn't much
call for wrapping resources themselves with middleware, and they don't do
very much to begin with.

>I'm starting to get a better feel for how this overlaps with templates --
>coming at it from this direction is easier than from the WSGI direction.

So far the biggest architectural flaw (efficiency-wise) that I see in all
this is that if your resources are all in eggs, and you have a *lot* of
them, you have to read *all* of the eggs' resource indexes before you can
return a single match. While it's true that you could have a shorter list
for each egg that indicates only what attribute/value combinations are
offered by that egg, you still have to read *that* list for all of them for
the first search, and for eggs with a small number of resources it'll be
almost as fast to just read the full index to avoid multiple I/O operations.
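
The "shorter list" idea can be sketched as a pre-filter. Everything here is an assumption made for illustration: the summary is modeled as a set of (attribute, value) pairs per egg, and the function name is invented. Note that the filter still has to look at every egg's summary once, which is exactly the cost described above.

```python
def eggs_worth_reading(summaries, criteria):
    """Yield names of eggs whose summary could satisfy every criterion.

    `summaries` maps a (hypothetical) egg name to the set of
    (attribute, value) pairs that egg offers; `criteria` is a list of
    (attribute, [values]) pairs. Only these eggs need their full
    resource index read.
    """
    for egg, offered in summaries.items():
        if all(
            any((attr, value) in offered for value in values)
            for attr, values in criteria
        ):
            yield egg


summaries = {
    'AppSkin.egg': {('layer', 'some_layer'), ('locale', 'en')},
    'GermanPack.egg': {('layer', 'other_layer'), ('locale', 'de')},
    'Unrelated.egg': {('layer', 'admin'), ('locale', 'en')},
}
criteria = [
    ('layer', ['some_layer', 'other_layer']),
    ('locale', ['en', 'de']),
]
candidates = list(eggs_worth_reading(summaries, criteria))
```

A summary match is only a hint: an egg can offer each wanted value on different resources, so the full index still decides whether any single resource matches.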

Anyway, conceptually I think this is something that's useful for pretty
much any extensible, localizable Python application, especially ones that
are web-based. I can see many potential implementations for how you get
the resource data *in* to the system, too. For example, peak.web already
has an .ini file format that lets you set content type rules, using section
headings that give filenames or wildcards, and then the entries list
properties to be assigned.

I'm thinking that on the egg side, I'd use a new setuptools entry point for
"resource finder" plugins. Their job will be to scope out the distribution
source for resources and add them to an index. The index would then be
written to the egg's metadata directory. A 'resource_finders' keyword to
setup would list the names of the entry points to use, so that you don't
have every possible resource finder chugging away and adding false
positives. It might be that the keyword would be a dictionary, e.g.:

    setup(
        ...
        publish_resources = {
            'peak.web': ['somepkg/foo', ...],
            'chandler.translations': {'someparcel':'foobar/LC_MESSAGES'},
            ...
        }
    )

That is, 'publish_resources' would be a dictionary mapping entry point
names to arguments that define how those plugins should locate and index
the resources. The above would fire the peak.web and chandler.translations
plugins from the 'setuptools.resource_finders' entry point group in order
to index the available resources for publication. (Notice that each
resource finder can have a different parameter format, if it likes.)
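
A rough sketch of how that dispatch might work. The group name 'setuptools.resource_finders' and the 'publish_resources' mapping come from the message; the finder call signature (dist_root, args), the function names, and the index-as-list-of-dicts shape are assumptions. The lookup uses the standard library's importlib.metadata rather than anything available in 2006.

```python
from importlib.metadata import entry_points


def load_finders(group='setuptools.resource_finders'):
    """Map entry point names to loaded finder callables.

    Uses the `group=` selection form of entry_points(), which needs
    Python 3.10 or later.
    """
    return {ep.name: ep.load() for ep in entry_points(group=group)}


def build_index(publish_resources, finders, dist_root):
    """Run only the finders named in publish_resources.

    Each finder is assumed (hypothetically) to be callable as
    finder(dist_root, args) and to return an iterable of index entries.
    The args are passed through untouched, so each finder can define
    its own parameter format.
    """
    index = []
    for name, args in publish_resources.items():
        try:
            finder = finders[name]
        except KeyError:
            raise LookupError('no resource finder named %r' % name) from None
        index.extend(finder(dist_root, args))
    return index
```

In a real build step, `finders` would come from load_finders() and the resulting index would be written to the egg's metadata directory; passing the finders in explicitly keeps the dispatch easy to exercise with stand-in callables.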

Hm. After all this talk about the thing, I kind of want to just go
implement it so *I* can use it, standards be damned. :) OTOH, I think it
would be really good to get more feedback on the concept before doing
that. I'd be especially curious to hear from the framework developers,
esp. of Zope, TurboGears, and Myghty. Zope of course already has similar
resource-finding abilities, but they don't tie to eggs. They should have
good feedback about both what you need to be able to search on, and what
kind of performance issues are likely to arise.