[Web-SIG] A more useful command-line wsgiref.simple_server?

Mon Apr 2 06:54:36 CEST 2012

On Fri, Mar 30, 2012 at 2:20 PM, Masklinn <masklinn at masklinn.net> wrote:

> On 2012-03-30, at 20:22 , Sasha Hart wrote:
> >
> > I am finding more reasons to dislike that -m:
> >
> >    python -m wsgiref.simple_server -m blog app
> >
> > Beyond looking a little stuttery, it's really unclear. Anyone could be
> > forgiven for thinking that -m meant the same thing in both cases
>
> And it does. In both cases it means "use this module for execution". Hence
> `-m`, as a shorthand for `--module`.
>

I don't agree; there are real differences. python -m runs e.g. what is in
__main__.py, and this is effectively a script: the interface is defined by
things like sys.argv, environment variables, exit code, stdin/stdout.
yourtool -m asks a module to put a callable into a namespace, and the name
of the callable must be known to decide which callable to use. A simple
HTTP server must then run and repeatedly call the callable according to the
WSGI convention, not a script interface. Typically there is some kind of
event loop, etc. The things 'executed' are different, and they are
'executed' in quite different ways. So while you can use -m (or -xyz or
-rm-rf), in my opinion it is a little more confusing and a little more
long-winded than makes sense for a clean one-liner for newbies, partially
because I think it suggests a misleading parallel.
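
To make the distinction concrete, here is a rough sketch of the two
interfaces side by side. It is only an illustration: the names main and app
are hypothetical and not part of any proposed patch.

```python
def main(argv):
    """Script-style interface: input arrives via argv, the result is an exit code."""
    print("arguments:", argv[1:])
    return 0  # a script would pass this to sys.exit()


def app(environ, start_response):
    """WSGI-style interface: a server imports this and calls it once per request."""
    body = b"hello\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# `python -m blog` would reach main() through the usual __main__ guard.
# A WSGI server never does that; it imports the module, looks up `app`
# by name, and calls it repeatedly from its own request loop.
```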

But this is just feedback; if you don't like it, then just get more
feedback. My vote is for whatever gets consensus as easiest for users who
have a use for the thing. If you get a patch in, great. If I don't like it
then I will just recommend that (still pretty trivial, conceptually clear)
wsgiref recipe or gunicorn or whatever has nailed the extreme simplicity
required for this case where you are avoiding a 'real server' deploy.

> > so I see no gain of understanding by reusing the convention. python -m
> > doesn't take a second positional argument, either.
>
> I'm not sure why that would matter.
>

Why it could matter (for what it's worth) is that two different calling
conventions are more for a newbie to keep straight, for a '-m' which is
supposed to mean one thing.

>
> > You can't write '-m blog app -m
> > wsgiref.simple_server' or '-m blog -m wsgiref.simple_server blog'
>
> Naturally, because this makes no sense at all, the tool being invoked
> to start the server is all of `python -mwsgiref.simple_server`. But
> that's the very basics of the -m option.
>

It is only natural for people who have strong domain knowledge of Python,
for whom the proposed feature is redundant. For people who don't care about
Python or are just starting, it is not safe to assume detailed knowledge of
the -m option to the python executable. Many utilities allow the same flag
to be repeated to supply multiple arguments, for example; the
interpretation of -m and the rest of the command line here is not
unprecedented, but also not just obvious to every bash user. If this is
easy, how hard is it to paste in the wsgiref recipe or run gunicorn? Take
it or leave it, my input is that this has no reason to exist in stdlib
unless it is at least as simple. But as I said before, if you get a patch
in then that's great.
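
For reference, the wsgiref recipe mentioned here is roughly the following.
This is a sketch from memory rather than a quote of any particular
document; the app function, the serve helper and the port number are
placeholders.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Placeholder WSGI application; normally imported from your own module."""
    body = b"Hello, world\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def serve(application, host="localhost", port=8000):
    """Serve `application` with wsgiref until interrupted (this call blocks)."""
    httpd = make_server(host, port, application)
    httpd.serve_forever()

# To try it, call serve(app) and open http://localhost:8000/ in a browser.
```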

> I fail to see why that would be any more troubling than
> `python -mcalendar -m 6`. Or require any more specifics.
>

I think 'draw six calendars across' is not at all confusable with 'run the
calendar module as a script' because that utility is so simple. The one
under discussion is at least somewhat more complex, since you are asking
one module to run as a script which imports another module and picks out a
callable and starts an HTTP/WSGI server. We can keep track of that without
any trouble, but we are also subscribers to Web-SIG.

I will likely never use that calendar one-liner in my life, nor recommend
it to any newbie I am trying to help. Whether it is good or bad, it matters
very little. If you get a patch into stdlib with really simple syntax then
I will evangelize it right after 'hello world'. If not, there are other
options which are also okay.

>
> > On reflection, I feel strongly that a module name should be the default
> > positional arg, not a filename. I agree with PJ Eby that pointing directly
> > at a file encourages script/module confusion. I would add that it
> > encourages hardcoding file paths rather than module names, which is brittle
> > and not good for the WSGI world (for example, it bypasses virtualenvs and
> > breaks any time a different deploy directory structure is used).
>
> Not sure how that makes sense, it uses the Python instance and site-package
> in the virtualenv, there is nothing to bypass.
>

The problem is that you get scripts coded directly against
Z:\WORK\FROB\FROB2\FARPLE.PY instead of correctly searching for farple on
PYTHONPATH (thus separating the concerns of how/where to install farple
from those of how to get it). If you write a script which hardcodes file
paths to modules, it asks for exactly the same file regardless of which
virtualenv you are in.
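
A small sketch of the difference, assuming a modern Python 3 with
importlib.util available. The helper name locate is made up for
illustration. The point is that the answer depends on the interpreter and
its sys.path, whereas a hardcoded file path never changes.

```python
import importlib.util


def locate(module_name):
    """Return the file the import system would load for module_name, or None.

    The lookup goes through sys.path, so the result differs between
    virtualenvs. For dotted names the parent package is imported first.
    """
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


# Prints whichever copy of the module the current environment provides.
print(locate("wsgiref.simple_server"))
```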

I fully understand why the python interpreter takes file arguments. In web
apps we are not talking about just running a file through the interpreter
any more. WSGI doesn't involve 'running' scripts, but rather importing
modules which contain WSGI application callables. Here we are talking about
how to tell python to import a file by its filename, when Python itself
already provides a clearly better way of finding things. Now I reckon you
are seasoned and you know about those subtleties, but if I am just
switching from PHP and you teach me to specify this with file paths then
that is setting me up for frustration down the line, when I either have to
deal with correct imports or I hit the wall of the 'import from py files'
approach.

>
> > Of course, this also means no '-m'. Then the typical use case is really just
> >
> >    python -m wsgiref.simple_server blog
> >
> > A second positional arg is both a new convention and not an explicit one,
> > where I would prefer either an existing implicit convention or a new
> > explicit one.
>
> What is not explicit in having an explicit argument, that it's a positional
> one instead of an option? How is a colon "more explicit"?
>

Well, that isn't what I said... I said that I would prefer EITHER an
existing implicit convention (e.g. the widely used colon, which also made
sense to me from the moment I saw it) OR an explicit convention (e.g. not
allowing a second positional argument, only a labeled argument if a second
argument were needed). The explicit one is wordy but discoverable. The
implicit one is clean and allows me to transfer knowledge from myriad other
tools using the same convention. The second positional arg has the
disadvantages of both: it is not discoverable, as it lacks a label, nor
does it benefit from sharing a convention with other tools (the latter has
a little relevance to me when I wish to teach others, as well).

This is why I was -1 on the second positional arg, and +1 on EITHER of the
other two options, which have different pluses and minuses.

>
> > I think PJ Eby is right that the colon convention is only for modules, and
> > I think following gunicorn's lead here would result in a nicer interface
> > than forcing (say) --module
> >
> >    python -m wsgiref.simple_server blog:app
>
> The colon is no more explicit than a second positional argument. In fact,
> it is significantly less so since it can not be separately and clearly
> documented and the one positional parameter needs to document its parsing
> rules instead.
>

I am sure that is slightly easier to write, anyway. This attribution of
'explicit' to the colon convention is probably a misunderstanding; I never
said that (although I find it strange to say that the second positional arg
is in any way more explicit, particularly when it is inventing a convention
for referring to WSGI callables). The virtue of the colon syntax for me is
that it fully covers the case which needs to be supported without any
--args (simple, clean looking), while also allowing transfer of knowledge
between this and just about every other tool out there which names WSGI
callables. It's close to the simple Python notation for referring to the
object - perhaps it could be better though?
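
As a sketch of how little parsing the colon form needs, something like the
following would do. The function name resolve and the fallback name
'application' are assumptions made for illustration, not an existing
wsgiref API.

```python
import importlib


def resolve(spec, default_name="application"):
    """Turn 'package.module:attr' into the object it names.

    The part before the colon is imported by module name (searched on
    sys.path); the part after it is looked up as an attribute, allowing
    dotted attributes. With no colon, `default_name` is used instead.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for part in (attr_path or default_name).split("."):
        obj = getattr(obj, part)
    if not callable(obj):
        raise TypeError("%r does not name a callable" % spec)
    return obj


# demo_app ships with wsgiref, so this works without any project set up.
print(resolve("wsgiref.simple_server:demo_app"))
```

With a helper like this, 'blog:app' means the same as 'from blog import
app', which is where the transfer of knowledge comes from.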

>
> > If there is a need to point at a filename, I agree that it should be done
> > explicitly.
> >
> >    python -m wsgiref.simple_server --file=~/app.py
> >
> > (or whatever the flag should be called). To me this seems like a small cost
> > to allow the colon by default without possibility of confusion or overly
> > fancy parsing.
>
> 1. It also does not work considering you can't specify the application's
>   name in that scheme, so piling on yet more complexity would be
>   required and there would be two completely different schemes for
>   specifying an application name. I don't find this appealing.
>

I guess the problem is that you are indirectly specifying the namespace,
via a filename to be imported, and then directly specifying the name. All
this (including cross-platform issues) is so much easier if you just use
Python's conventions for finding and referencing modules and names in
namespaces. But suppose that, for whatever reason, you must import Python
modules by filename (something I'll never do again, for reasons already
laid out). WSGI still requires a name to be specified, so you end up with
some kind of awkward hybrid specification. I certainly do not suggest
trying to invent a
new convention, but if you want to then you just need to pick a portable
delimiter for the command line. Maybe a second positional arg or --app= for
the cases where you are importing from a file and do not want 'application'
and must have a one-liner rather than setting up the project. I have no
reason to care because I will never use or teach this case.
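
For completeness, the file-based case could look roughly like this on a
modern Python 3. The helper name load_from_file and its app_name parameter
(standing in for a hypothetical --app= option) are illustrative
assumptions, not a proposal.

```python
import importlib.util
import os


def load_from_file(path, app_name="application"):
    """Import the Python file at `path` and return its `app_name` attribute.

    Unlike a module-name import, this bypasses sys.path entirely: the same
    file is loaded no matter which environment is active.
    """
    path = os.path.abspath(os.path.expanduser(path))
    module_name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load a module from %r" % path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return getattr(module, app_name)
```

Note that the namespace is named by a filesystem path and the callable by a
Python identifier, which is the hybrid described above.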

I think this is the interesting question: what convention already exists in
Python for naming a particular object in an imported module?

> 2. You seem to have asserted from the start that the default should be
>   mounting modules, but I have seen no evidence or argument in favor of
>   that so far.
>

In a way you do not have any choice but to 'mount modules': WSGI does not
provide a mechanism to 'run a program' without picking a callable Python
object out of a namespace, implying an import.

My intention was to offer feedback, on the assumption that the idea was
being thrashed out publicly. I believe I actually have offered some
supporting reasoning for why I like the module path better as a default,
even though you reject it. But ultimately, much of this is or verges on
bikeshedding. Now our exchange has taken on a negative tone that I would
not have chosen. I have written to clarify some misunderstandings, but I
will not bother offering further unwanted feedback. I hope you will look
for more input and develop a better consensus.

>   Defaulting to scripts not only works with both local modules and
>   arbitrary files and follows CPython's (and most tools') own behavior,
>   but would also allow using -mwsgiref.simple_server as a shebang
>   line. I find this to have quite a lot of value.
>

The python interpreter runs scripts, i.e. processes which interface with
sys.argv and stuff like that. So of course it needs to take file paths. But
imports are not typically done with file paths, and WSGI app-finding is
either imperatively about importing or declaratively about Python
namespaces - not file paths or scripts, as in CGI. Python imports are
conventionally done by a dotted path like frob.bar rather than by filename,
since filenames have problems (I think this is why we don't "import
c:\work\frob\farple.py" at the top of our files, or run "python -m
/usr/lib/python3.2/wsgiref.py simple_server"). You must also specify a name
for the app object, and it
happens that the conventions around specifying names in namespaces dovetail
pretty closely with the conventions around imports. (e.g.: from frob import
app, import frob.app). Given that the operation required is an import and
then a lookup, it seems more natural to me to use Python notation or
something trivially related to it, rather than OS-specific filesystem
notation. But I am sure this is just another disagreement and that's fine
with me.

I would personally not +x a module file just to serve an app with wsgiref
from the hashbang line; it's clever but I can't come up with any real
benefit. A case where I'm serving with wsgiref already has to be pretty
trivial and I'm not going to couple to it *from inside the module itself*
when it is so darned easy to just run the module from several nice python
test servers (also portable and I can use autoreload, etc.). But if this is
desired by many others, I'd agree it's a good factor to consider.

Cheers
Sasha