[Web-SIG] Bill's comments on WSGI draft 1.4
Bill Janssen
janssen at parc.com
Thu Sep 2 03:07:19 CEST 2004
Well, thanks to Andrew's comment about my non-participation, I've
finally read PEP 333, version 1.4, and have a few comments.
Phillip, great job, nice reasoning. I like the general design. I
think the project as a whole is quite useful.
I've been using a custom framework together with Medusa, and as I read
I tried to imagine how my framework could be implemented under WSGI.
There seem to be no show-stoppers, though I have yet to try it.
A meta comment on commenting on PEP drafts: Without numbered sections,
paragraphs, and lines, there's no effective way to point back to
specific wording in the draft without quoting it.
A few nits about WSGI:
1. The "environ" parameter must be a Python dict: I think subclasses
should be allowed. A true subclass supports all methods of its
ancestors, so the rationale presented in the back of the PEP for
excluding them doesn't hold water. I think the appropriate check
would be to see if the object passed in is an instance of "dict" or
of one of its subclasses. That is, "isinstance(e, dict)" should return
True.
2. The "fileno" attribute on the returned iterable. I'm a bit
concerned about using operating system file descriptors, due to
resource constraints; I think a better check would be to see if the
returned iterable is an instance of the "file" class. That is,
"isinstance(f, file)" should return True.
3. Comments about "The [status-line] string must be 7-bit
ASCII...containing no control characters." That's overly restrictive;
I think it would be better to simply refer to RFC 2616 and say that it
should follow the rules defined there for "Reason-Phrase".
4. Similarly, the rules about header values are more restrictive than
HTTP; they therefore prevent perfectly valid HTTP header values from
being returned. That's bad. Again, I think the PEP should simply
refer to RFC 2616 and say, "Use those rules".
5. The phrase about "if a server or gateway discards or overrides any
application header for any reason, it must record this in a log"; that
should be "should" instead of "must". Otherwise you'll have your log
cluttered with innocuous header re-write messages, and no way to turn
that off.
6. The "write()" callable is important; it should not be deprecated
or in some other way made a poor stepchild of the iterable.
7. If an application returns an iterable after calling write(), are
the strings produced by iteration written after those written by calls
to write()?
8. The note on Unicode: Unfortunately, Web standards like HTTP rely
on using proper character sets. By *not* using Unicode strings, and
by *not* specifying the character set encoding of the "raw" byte
strings, we open the door for disastrous misunderstandings. The
safest thing to do would be to require the framework to traffic in
Unicode strings for things like header values, which the WSGI
middleware would translate to or from the various required encodings
used by the server and external protocols. At least with Unicode
strings you know what encoding is being used.
A riskier, more error-prone option would be to require the byte
strings to be in particular encodings.
The content strings, those written to the "write()" calls, or returned
by the iterable, should in fact be byte vectors, exactly as they are
currently specified.
9. There should be a non-optional way of indicating the URL scheme,
whether it is "http", "https", or "ftp". I'd suggest "wsgi.scheme" in
the environ.
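To make nits 1, 2, and 9 concrete, here is a minimal sketch of the
checks suggested above. It is an illustration only, not anything taken
from the PEP. The names TracingEnviron, environ_is_acceptable,
is_file_object, and url_scheme are hypothetical, and "wsgi.scheme" is
the key proposed in nit 9 rather than one the draft defines. The sketch
assumes Python 3, where the "file" built-in no longer exists, so
io.IOBase stands in for it as the nearest (and somewhat broader)
analogue.

```python
import io


class TracingEnviron(dict):
    """Hypothetical dict subclass a server might pass as environ."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = set()

    def __getitem__(self, key):
        # Record which keys the application reads, then defer to dict.
        self.accessed.add(key)
        return super().__getitem__(key)


def environ_is_acceptable(environ):
    # Nit 1: accept dict itself and any subclass of dict.
    return isinstance(environ, dict)


def is_file_object(result):
    # Nit 2: test the type instead of probing for a "fileno" attribute.
    # Assumption: io.IOBase substitutes for the Python 2 "file" class.
    return isinstance(result, io.IOBase)


def url_scheme(environ):
    # Nit 9: read the scheme from the proposed "wsgi.scheme" key.
    return environ["wsgi.scheme"]


environ = TracingEnviron({"wsgi.scheme": "https", "PATH_INFO": "/"})
```

A subclass such as TracingEnviron passes the isinstance() test and
still behaves as a dict for the application, which is the point of
nit 1. A plain list of byte strings fails is_file_object(), so a
server using such a test would fall back to ordinary iteration.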
Bill