[issue10155] Add fixups for encoding problems to wsgiref

And Clover report at bugs.python.org
Thu Nov 4 04:55:09 CET 2010

And Clover <and at doxdesk.com> added the comment:


Some of those additions in _needs_transcode are potentially controversial, though. I'm not wholly sure it's the right thing to transcode these.

Some of them may not actually come from the request, eg `REMOTE_USER` may be filled in by IIS's Windows authentication using a native-Unicode string from the Windows user database. Is it the right thing to turn it into UTF-8-bytes-in-Unicode for consistency with Apache? Maybe. (At least for most of the other new envvars there will never see a non-ASCII character. Or in `REMOTE_IDENT`'s case never be used for anything.)

The case with the REDIRECT_HTTP_ and SSL_ envvars is an interesting one. Whilst transcoding them at some point will very probably be what applications need to do if they want to actually use them, is it within CGIHandler's remit to change Apache mod-specific variables that are not specified by CGI or WSGI?

(There might, after all, be lots of these to catch for other mods and servers, and it's *conceivable* that somebody might be re-using one of these names to set in the environment for some other purpose, in which case transcoding would be adding an unexpected mangling. We can't in the general case expect users to know to avoid envvar names are used as non-standard extensions in all servers.)

REDIRECT_HTTP_ at least comes from the HTTP request, so I guess the consistency is good there. (But then I think the only header that actually may contain non-ASCII is REDIRECT_URL, which replaces the unescaped SCRIPT_NAME and PATH_INFO; that one isn't caught at the moment.)

versions:  -Python 2.7

Python tracker <report at bugs.python.org>

More information about the Python-bugs-list mailing list