[Web-SIG] WSGI & CGI spec
Ian Bicking
ianb at colorstudy.com
Sun Dec 17 19:21:37 CET 2006
Reading the CGI spec, I'm noticing some requirements it makes that
aren't consistently followed in WSGI.
In particular:
    6.1.8. QUERY_STRING

    A URL-encoded string; the <query> part of the Script-URI. (See
    section 3.2.)

        QUERY_STRING = query-string
        query-string = *uric

    The URL syntax for a query string is described in section 3 of
    RFC 2396 [4].

    Servers MUST supply this value to scripts. The QUERY_STRING
    value is case-sensitive. If the Script-URI does not include a
    query component, the QUERY_STRING metavariable MUST be defined
    as an empty string ("").
Notice that it is a MUST. We've already discussed that the cgi module
acts weird if QUERY_STRING isn't provided, but we can go further and say
that the cgi module is right to expect it (well, except for its weird
behavior otherwise) and any server not providing it is incorrect.
Notably SCRIPT_NAME and PATH_INFO are always required to be present
(even when empty). REMOTE_ADDR and SERVER_SOFTWARE are also required
values.
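
As a rough sketch of what checking for these keys could look like: the
helper names below (missing_cgi_keys, fill_empty_defaults) are made up
for illustration and are not part of WSGI or wsgiref. Only the three
keys that may legitimately be empty get a default; REMOTE_ADDR and
SERVER_SOFTWARE have no sensible fallback, so they are only reported.

```python
# Keys that the CGI requirements discussed above say must be present.
REQUIRED_CGI_KEYS = (
    "QUERY_STRING",
    "SCRIPT_NAME",
    "PATH_INFO",
    "REMOTE_ADDR",
    "SERVER_SOFTWARE",
)


def missing_cgi_keys(environ):
    """Return the required keys that are absent from environ.

    An empty string counts as present: the requirement is that the key
    is defined, even when there is nothing to put in it.
    """
    return [key for key in REQUIRED_CGI_KEYS if key not in environ]


def fill_empty_defaults(environ):
    """Return a copy of environ with the may-be-empty keys defaulted to ""."""
    fixed = dict(environ)
    for key in ("QUERY_STRING", "SCRIPT_NAME", "PATH_INFO"):
        fixed.setdefault(key, "")
    return fixed
```

A server or middleware could call fill_empty_defaults() before handing
the environ to the application, and use missing_cgi_keys() in tests.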
Lastly, it was pointed out to me that SCRIPT_NAME and PATH_INFO are
supposed to be decoded (the CGI spec says "The syntax and semantics are
similar to a decoded HTTP URL 'path' token (defined in RFC 2396 [4])").
I haven't been doing this, and the WSGI spec isn't clear on it (wsgiref
does do this decoding, as does Apache). That is to say, when you request
'/foo%20bar/', PATH_INFO should be '/foo bar/'.
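
A minimal sketch of that decoding step, assuming a modern Python where
the function lives at urllib.parse.unquote (on the Python 2 of this
thread it would be urllib.unquote). The helper name decoded_path is
hypothetical:

```python
from urllib.parse import unquote


def decoded_path(raw_path):
    """Percent-decode the path portion of a request URI."""
    # Drop any query string first; that part stays encoded in QUERY_STRING.
    path = raw_path.split("?", 1)[0]
    return unquote(path)


print(decoded_path("/foo%20bar/"))  # /foo bar/
```

One consequence worth keeping in mind: after decoding, an escaped slash
(%2F) can no longer be told apart from a literal path separator.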
It's also unclear if the WSGI server is expected to normalize the path,
specifically things like /foo/../bar -- Apache does do this, wsgiref
does not. (Is posixpath.normpath good enough to do that?)
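
For what it's worth, here is how posixpath.normpath behaves on a few
URL-ish paths. It collapses the '..' segment, but it also drops the
trailing slash, keeps exactly two leading slashes, and turns an empty
string into '.', none of which is obviously right for PATH_INFO. The
normalize_path wrapper is a hypothetical workaround, not something any
server is known to use:

```python
import posixpath

print(posixpath.normpath("/foo/../bar"))   # /bar
print(posixpath.normpath("/foo/../bar/"))  # /bar       (trailing slash dropped)
print(posixpath.normpath("//foo/bar"))     # //foo/bar  (two leading slashes kept)
print(posixpath.normpath(""))              # .


def normalize_path(path):
    """Normalize a decoded URL path, papering over normpath's quirks."""
    if not path:
        # normpath("") would return ".", which is wrong for an empty PATH_INFO.
        return path
    normalized = posixpath.normpath(path)
    if normalized.startswith("//"):
        # POSIX treats a leading "//" specially; a URL path does not need that.
        normalized = normalized[1:]
    if path.endswith("/") and not normalized.endswith("/"):
        # Keep the trailing slash, since /bar and /bar/ are different URLs.
        normalized += "/"
    return normalized
```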
Anyway, mostly I think we need some clarifications in the spec, though
requiring normalization would add a new requirement rather than just
clarify an existing one.
--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org