[Web-SIG] PEP 444 (aka Web3)
and-py at doxdesk.com
Fri Sep 17 15:43:21 CEST 2010
On 09/17/2010 02:03 PM, Armin Ronacher wrote:
> In case we change the spec as Ian mentioned above, I am all for
> a "wsgi.guessed_encoding" = True flag or something like that.
Yes, I'd like to see that. I believe going with *only* a
raw-or-reconstructed path_info, rather than having both path_info and
PATH_INFO, is probably best, for the middleware-duplication reasons PJE
A more in-depth possibility might be:

  0: script_name/path_info have been crudely reconstructed from
     SCRIPT_NAME/PATH_INFO from an unknown source. Beware!
     If there is to be backwards compatibility with WSGI1, this
     would be seen as the 'default value' given a missing path_accuracy.

  1: script_name/path_info have been reconstructed, but it is known
     that path_info is accurate, other than %2F and non-ASCII issues.
     That is, it's known that the path doesn't come from IIS's broken
     PATH_INFO, or the IIS error has been detected and compensated for.

  2: script_name/path_info have been reconstructed using known-good
     encodings for the env. The only way in which they may differ from
     the original request path is that a slash might originally have
     been a %2F. (This is good enough for the vast majority of

  3: script_name/path_info come directly from the request path
     without any intervening mangling.
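
To make the four levels concrete, here is a rough sketch of how an application might read such a value. The environ key "web3.path_accuracy", the enum and its member names, and both helper functions are made up for illustration; only the 0 to 3 meanings and the "missing means 0" default come from the list above.

```python
from enum import IntEnum


class PathAccuracy(IntEnum):
    """Hypothetical names for the four levels described above."""
    UNKNOWN = 0          # crudely reconstructed, source unknown
    NOT_IIS_MANGLED = 1  # reconstructed; only %2F and non-ASCII issues remain
    KNOWN_ENCODING = 2   # reconstructed with known-good encodings; only %2F may differ
    RAW = 3              # taken directly from the request path


# Hypothetical environ key; the thread does not settle on a name.
PATH_ACCURACY_KEY = "web3.path_accuracy"


def path_accuracy(environ):
    """Return the declared accuracy, treating a missing key as level 0."""
    return PathAccuracy(environ.get(PATH_ACCURACY_KEY, PathAccuracy.UNKNOWN))


def can_trust_non_ascii(environ):
    """Non-ASCII characters in path_info are only reliable from level 2 up."""
    return path_accuracy(environ) >= PathAccuracy.KNOWN_ENCODING
```

Because the levels are ordered, a framework could compare against a threshold rather than test for each value separately.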
> Unless I am mistaken, the same is true for CGI scripts running on
> Apache2 on Windows.
Yes, it's true of *all* CGI scripts, and also of non-CGI scripts on IIS.
> I did some tests a while ago and was pretty sure that Apache2 on Windows
> did the same.
Apache-on-Windows puts the bytes of the decoded path into the
environment variables as one code unit per byte: that is, as if encoded
by ISO-8859-1. You still have to read the environ using ctypes because
mbcs is never ISO-8859-1, but at least the original bytes are
recoverable, which isn't the case with IIS.
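
A minimal sketch of that recovery step, assuming the variable has already been read as a Unicode string with one code unit per original byte (the ctypes read itself is not shown). The function names are made up, and treating the recovered bytes as UTF-8 is an assumption about the client, not something the server guarantees.

```python
def recover_path_bytes(env_value):
    """Map each code unit (0-255) back to the byte it stood for.

    Raises UnicodeEncodeError if the value contains anything above U+00FF,
    which would mean the one-code-unit-per-byte assumption does not hold.
    """
    return env_value.encode("iso-8859-1")


def decode_path(env_value, encoding="utf-8"):
    """Recover the original bytes, then decode them with a guessed encoding.

    UTF-8 is only a guess; undecodable bytes are replaced, not rejected.
    """
    return recover_path_bytes(env_value).decode(encoding, "replace")
```

For example, a request for "/caf%C3%A9" would show up in the environment as "/caf" followed by U+00C3 and U+00A9, and decode_path turns that back into "/café".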
> The correct place for these hacks would be the appropriate WSGI/Web3
> handler of the webserver.
The IIS PATH_INFO-prefix hack would be appropriate to put in an
IIS-specific handler; indeed, I believe isapi_wsgi does just that. But
the other hacks are specific to CGI.
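
As a rough illustration of the prefix hack, assuming the IIS problem in question is that PATH_INFO arrives with SCRIPT_NAME repeated at the front: the function below is a made-up helper, not the code isapi_wsgi actually uses.

```python
def strip_script_name_prefix(script_name, path_info):
    """Remove a leading copy of script_name from path_info, if present.

    Only strips on a path-segment boundary, so "/app" is not removed
    from "/application/x".
    """
    if script_name and path_info.startswith(script_name):
        rest = path_info[len(script_name):]
        if rest == "" or rest.startswith("/"):
            return rest
    return path_info
```

On a server without the bug, a legitimate PATH_INFO that happens to begin with the script name would be stripped by mistake, which is why this belongs in a handler that knows it is talking to IIS.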
For CGI, there is no 'handler of the webserver', there is only the
standard CGI-to-WSGI adapter, so this is the only component it is
reasonable to burden with the hacks. Frameworks and libraries further up
the stack cannot reliably do the fixups, because they don't know whether
the WSGI environ they have been given comes from os.environ or somewhere
else, or whether middleware has played with it.
mailto:and at doxdesk.com