[Web-SIG] WSGI & CGI spec

Ian Bicking ianb at colorstudy.com
Sun Dec 17 19:21:37 CET 2006


Reading the CGI spec, I'm noticing that it makes some requirements that 
aren't consistently honored in WSGI.

In particular:

6.1.8. QUERY_STRING

    A URL-encoded string; the <query> part of the Script-URI. (See
    section 3.2.)

     QUERY_STRING = query-string
     query-string = *uric

    The URL syntax for a query string is described in section 3 of
    RFC 2396 [4].

    Servers MUST supply this value to scripts. The QUERY_STRING
    value is case-sensitive. If the Script-URI does not include a
    query component, the QUERY_STRING metavariable MUST be defined
    as an empty string ("").


Notice that it is a MUST.  We've already discussed that the cgi module 
acts weird if you don't provide this; but we can go further and say that 
the cgi module is right (well, except for its weird behavior otherwise) 
and anything not providing this is incorrect.

Notably SCRIPT_NAME and PATH_INFO are always required to be present 
(even when empty).  REMOTE_ADDR and SERVER_SOFTWARE are also required 
values.
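
As a rough sketch of what checking for these keys could look like (the
names REQUIRED_CGI_KEYS, missing_cgi_keys and DefaultQueryString are made
up for illustration and come from neither spec nor any library), assuming
a plain dict-like WSGI environ:

```python
# Keys this message lists as required by the CGI spec.
REQUIRED_CGI_KEYS = (
    'QUERY_STRING',
    'SCRIPT_NAME',
    'PATH_INFO',
    'REMOTE_ADDR',
    'SERVER_SOFTWARE',
)


def missing_cgi_keys(environ):
    """Return the required keys that are absent from environ.

    A key that is present but empty ('') counts as supplied.
    """
    return [key for key in REQUIRED_CGI_KEYS if key not in environ]


class DefaultQueryString:
    """Hypothetical middleware: supply QUERY_STRING='' when it is absent."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # An absent query component must still appear as an empty string.
        environ.setdefault('QUERY_STRING', '')
        return self.app(environ, start_response)
```

Only QUERY_STRING gets a default here, because the quoted section states
what the empty value must be; something like REMOTE_ADDR can only come
from the server itself.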

Lastly, it was pointed out to me that SCRIPT_NAME and PATH_INFO are 
supposed to be decoded (the CGI spec says "The syntax and semantics are 
similar to a decoded HTTP URL 'path' token (defined in RFC 2396 [4])").  
I haven't been doing this, and the WSGI spec isn't clear on it (wsgiref 
does do this decoding, as does Apache).  That is to say, when you 
request '/foo%20bar/', PATH_INFO should be '/foo bar/'.
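
A minimal sketch of that decoding step, assuming Python 3's
urllib.parse.unquote (in Python 2 the same function lives at
urllib.unquote). The helper name decoded_path_info is hypothetical, and
the sketch sets aside how non-ASCII bytes should be handled:

```python
from urllib.parse import unquote


def decoded_path_info(raw_path):
    """Percent-decode the path portion of a request URL."""
    return unquote(raw_path)


print(decoded_path_info('/foo%20bar/'))  # prints: /foo bar/
```

One consequence of decoding is that an encoded slash can no longer be
told apart from a literal one: decoded_path_info('/a%2Fb') gives '/a/b'.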

It's also unclear if the WSGI server is expected to normalize the path, 
specifically things like /foo/../bar -- Apache does do this, wsgiref 
does not.  (Is posixpath.normpath good enough to do that?)
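
To make the posixpath.normpath question concrete, here is a sketch of a
wrapper around it. The name normalize_path_info is hypothetical, and the
choices it makes (keep a trailing slash, keep an empty path empty,
collapse a leading '//') are my assumptions, not anything either spec
requires:

```python
import posixpath


def normalize_path_info(path):
    """Collapse '.' and '..' segments in an absolute URL path (sketch)."""
    if not path:
        # posixpath.normpath('') returns '.', which is wrong for PATH_INFO.
        return path
    normalized = posixpath.normpath(path)
    if normalized.startswith('//'):
        # normpath preserves exactly two leading slashes (a POSIX rule);
        # assumption: a URL path wants just one.
        normalized = normalized[1:]
    if path.endswith('/') and not normalized.endswith('/'):
        # normpath strips the trailing slash, which is significant in URLs.
        normalized += '/'
    return normalized


print(normalize_path_info('/foo/../bar'))  # prints: /bar
print(normalize_path_info('/foo/bar/'))    # prints: /foo/bar/
```

So normpath on its own handles '/foo/../bar', but it also drops trailing
slashes, turns '' into '.', and leaves '//foo' alone, all of which need
a decision before it is used on PATH_INFO.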


Anyway, mostly I think we need some clarifications in the spec, though 
requiring normalization would be an additional requirement rather than 
a clarification.

-- 
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org

