[Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config
Ian Bicking
ianb at colorstudy.com
Sun Jul 24 21:12:02 CEST 2005
Phillip J. Eby wrote:
>> It goes in the .egg-info directory. This way elsewhere you can say:
>>
>> application = SomeApplication[feature1]
>
>
> I like this a lot, although for a different purpose than the format
> Chris and I were talking about.
Yes, this proposal really just simplifies a part of that application
deployment configuration; it doesn't replace it. Though it might make
other standardization less important.
> I see this fitting into that format as
> maybe:
>
> [feature1 from SomeApplication]
> # configuration here
>
>
>> And it's quite unambiguous. Note that there is *no* "configuration" in
>> the egg-info file, because you can't put any configuration related to a
>> deployment in an .egg-info directory, because it's not specific to any
>> deployment. Obviously we still need a way to get configuration in
>> there, but let's say that's a different matter.
>
>
> Easily fixed via what I've been thinking of as the "deployment
> descriptor"; I would call your proposal here the "import map".
> Basically, an import map describes a mapping from some sort of feature
> name to qualified names in the code.
Yes, it really just gives you a shorthand for the factory configuration
variable.
> I have an extension that I would make, though. Instead of using
> sections for features, I would use name/value pairs inside of sections
> named for the kind of import map. E.g.:
>
> [wsgi.app_factories]
> feature1 = somemodule:somefunction
> feature2 = another.module:SomeClass
> ...
>
> [mime.parsers]
> application/atom+xml = something:atom_parser
> ...
I assume mime.parsers is just a theoretical example of another kind of
service a package can provide? But yes, this seems very reasonable, and
even allows for loosely versioned specs (e.g., wsgi.app_factories02,
which returns factories with a different interface; or maybe something
like foo.configuration_schema, an optional entry point that returns the
configuration schema for an application described elsewhere).
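To make the import map layout concrete, here is a minimal sketch of reading such a file with the standard library's configparser. The sample text mirrors the sections quoted above; the name parse_import_map is only an assumption for illustration, and this is not how setuptools itself reads the file.

```python
import configparser

# Sample import map, copied from the sections proposed above.
SAMPLE = """\
[wsgi.app_factories]
feature1 = somemodule:somefunction
feature2 = another.module:SomeClass

[mime.parsers]
application/atom+xml = something:atom_parser
"""


def parse_import_map(text):
    """Return {section: {name: 'module:attr'}} for an import-map style file.

    Hypothetical helper, not part of any existing library.
    """
    # Only '=' separates names from values, so ':' stays inside the value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry names case-sensitive
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


import_map = parse_import_map(SAMPLE)
```

Each section name then acts as the "kind" of import map, and a versioned spec such as wsgi.app_factories02 would simply be another key in the resulting dictionary.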
This kind of addresses the issue where the module structure of a package
becomes an often unintentional part of its external interface. It feels
a little crude in that respect... but maybe not. Is it worse to do:
    from package.module import name
or:
    name = require('Package').load_entry_point('service_type', 'name')
OK, well clearly the second is worse ;) But if that turned into a
single function call:
    name = load_service('Package', 'service_type', 'name')
It's not that bad. Maybe even:
    name = services['Package:service_type:name']
Though service_type feels extraneous to me. I see the benefit of being
explicit about what the factory provides, but I don't see the benefit of
separating namespaces; the name should be unambiguous. Well... unless
you used the same name to group related services, like the configuration
schema and the application factory itself. So maybe I retract that
criticism.
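As a rough sketch of what the single-call form might look like, the following resolves a 'module:attr' string with importlib. The REGISTRY dictionary is a hypothetical stand-in for whatever metadata the packaging layer would actually provide, and it points at a standard-library function only so the example runs anywhere.

```python
import importlib

# Hypothetical registry standing in for per-package entry point metadata:
# {package: {service_type: {name: 'module:attr'}}}
REGISTRY = {
    "Package": {
        "service_type": {"name": "json:loads"},
    },
}


def resolve(spec):
    """Import a 'module:attr' spec (attr may be dotted) and return the object."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


def load_service(package, service_type, name):
    """Look up a named service in the registry and import its target."""
    return resolve(REGISTRY[package][service_type][name])


loads = load_service("Package", "service_type", "name")
```

Keeping service_type as a separate key is what lets the same name appear under two kinds of service, such as an application factory and its configuration schema.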
> In addition to specifying the entry point, each entry in the import map
> could optionally list the "extras" that are required if that entry point
> is used.
> It could also issue a 'require()' for the corresponding feature if it
> has any additional requirements listed in the extras_require dictionary.
I figured each entry point would just map to a feature, so the
extras_require dictionary would already have entries.
> So, I'm thinking that this would be implemented with an entry_points.txt
> file in .egg-info, but supplied in setup.py like this:
>
> setup(
>     ...
>     entry_points = {
>         "wsgi.app_factories": dict(
>             feature1 = "somemodule:somefunction",
>             feature2 = "another.module:SomeClass [extra1,extra2]",
>         ),
>         "mime.parsers": {
>             "application/atom+xml": "something:atom_parser [feedparser]"
>         }
>     },
>     extras_require = dict(
>         feedparser = [...],
>         extra1 = [...],
>         extra2 = [...],
>     )
> )
I think I'd rather just put the canonical version in .egg-info instead
of as an argument to setup(); this is one place where using Python
expressions isn't a shining example of clarity. But I guess this is
fine too; for clarity I'll probably start writing my setup.py files with
variable assignments, then a setup() call that just refers to those
variables.
>> Open issues? Yep, there's a bunch. This requires the rest of the
>> configuration to be done quite lazily.
>
>
> Not sure I follow you; the deployment descriptor could contain all the
> configuration; see the Web-SIG post I made just previous to this one.
Well, when I proposed that the factory be called with zero arguments,
that wouldn't allow any configuration to be passed in.
>> I don't think
>> this is useful without the other pieces (both in front of this
>> configuration file and behind it) but maybe we can think about what
>> those other pieces could look like. I'm particularly open to
>> suggestions that some_function() take some arguments, but I don't know
>> what arguments.
>
>
> At this point, I think this "entry points" concept weighs in favor of
> having the deployment descriptor configuration values be Python
> expressions, meaning that a WSGI application factory would accept
> keyword arguments that can be whatever you like in order to configure it.
Yes, I'd considered this as well. I'm not a huge fan of Python
expressions, because something like "allow_hosts=['127.0.0.1']" seems
unnecessarily complex to me. As a convention (maybe not a requirement;
a SHOULD) I like it when configuration consumers handle strings specially,
doing context-sensitive conversion (in this case maybe splitting on ','
or on whitespace). It would make me sad to see something accept
requests from the IP addresses ['1', '2', '7', '.', '0', '.', '0', '.',
'1']. This is the small sort of thing that I think makes the experience
less pleasant.
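A small sketch of the kind of context-sensitive conversion meant here, assuming a hypothetical as_list helper that a configuration consumer might apply to values such as allow_hosts:

```python
import re


def as_list(value):
    """Coerce a configuration value to a list of strings.

    Hypothetical helper: a plain string is split on commas and/or
    whitespace instead of being iterated character by character.
    """
    if isinstance(value, str):
        return [item for item in re.split(r"[,\s]+", value) if item]
    return list(value)


hosts = as_list("127.0.0.1")
several = as_list("127.0.0.1, 10.0.0.1")
```

With this convention a consumer can accept either a bare string from a text configuration file or a real list from Python code, and list("127.0.0.1") never happens by accident.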
> However, after more thought, I think that the "next application"
> argument should be a keyword argument too, like 'wsgi_next' or some
> such. This would allow a factory to have required arguments in its
> signature, e.g.:
>
> def some_factory(required_arg_x, required_arg_y, optional_arg="foo",
>                  ....):
>     ...
>
> The problem with my original idea to have the "next app" be a positional
> argument is that it would prevent non-middleware applications from
> having any required arguments.
I think it's fine to declare the next_app keyword argument as special,
and promise (by convention) to always pass it in with that name.
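For illustration, a minimal sketch of the convention: a plain application factory with a required argument of its own, and a middleware factory that receives the wrapped application under the name next_app. The factory names and the greeting argument are invented for this example.

```python
def hello_factory(greeting, name="world"):
    """Plain application factory: has a required argument, takes no next_app."""
    body = ("%s, %s!" % (greeting, name)).encode("utf-8")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    return app


def upper_factory(next_app):
    """Middleware factory: the wrapped application arrives as next_app."""

    def app(environ, start_response):
        # Upper-case every chunk produced by the downstream application.
        return [chunk.upper() for chunk in next_app(environ, start_response)]

    return app


# A loader would build the pipeline from the inside out, always passing
# the downstream application by keyword.
inner = hello_factory(greeting="Hello")
pipeline = upper_factory(next_app=inner)
```

Because next_app is passed by keyword, hello_factory is free to declare required positional arguments without any clash.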
> Anyway, I think we're now very close to being able to define a useful
> deployment descriptor format for establishing pipelines and setting
> options, that leaves open the possibility to do some very sophisticated
> things.
>
> Hm. Interesting thought... we could have a function to read a
> deployment descriptor (from a string, stream, or filename) and then
> return the WSGI application object. You could then wrap this in a
> simple WSGI app that does filesystem-based URL routing to serve up
> *.wsgi files from a directory. This would let you bootstrap a
> deployment capability into existing WSGI servers, without them having to
> add their own support for it! Web servers and frameworks that have some
> kind of file extension mapping mechanism could do this directly, of
> course. I can envision putting *.wsgi files in my web directories and
> then configuring Apache to run them using either mod_python or FastCGI
> or even as a CGI, just by tweaking local .htaccess files. However, once
> you have Apache tweaked the way you want, .wsgi files can be just
> dropped in and edited.
Absolutely; I see no reason WSGI servers should have any dispatching
logic in them, except in cases when they also dispatch to non-Python
applications (like Apache). So it seems natural that we present
deployment as a single application factory that takes zero or one arguments.
> Of course, there are still some open design issues, like caching of
> .wsgi files (e.g. should the file be checked for changes on each hit? I
> guess that could be a setting under "WSGI options", and would only work
> if the descriptor parser was given an actual filename to load from.)
I don't know what we'd do if we checked the file and found it wasn't up
to date. In this particular case I suppose you could reload the
configuration file, but if the change in the configuration file
reflected a change in the source code, then you're stuck because
reloading in Python is so infeasible. I'm all for warnings, but I don't
see how we can do the Right Thing here, as much as I wish it were otherwise.
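The part that is tractable, reloading the configuration file itself when it changes, could be sketched roughly as below. CachedDescriptor and its parse callable are assumptions for illustration; nothing here reloads changed source code, which is the hard part noted above.

```python
import os


class CachedDescriptor:
    """Re-parse a descriptor file only when its modification time changes."""

    def __init__(self, filename, parse):
        self.filename = filename
        self.parse = parse  # callable: text -> parsed configuration
        self._mtime = None
        self._value = None

    def get(self):
        """Return the parsed descriptor, re-reading the file if it changed."""
        mtime = os.stat(self.filename).st_mtime_ns
        if mtime != self._mtime:
            with open(self.filename, encoding="utf-8") as f:
                self._value = self.parse(f.read())
            self._mtime = mtime
        return self._value
```

A server could call get() on each hit when the checking option is enabled, and just once at startup otherwise.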
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org