[Web-SIG] Request for Comments on upcoming WSGI Changes

René Dudfield renesd at gmail.com
Mon Sep 21 19:57:03 CEST 2009


On Mon, Sep 21, 2009 at 6:05 PM, Robert Brewer <fumanchu at aminus.org> wrote:
> Armin Ronacher wrote:
>> WSGI will demand UTF-8 URLs and only
>> provide iso-XXX support for backwards compatibility.
>
> WSGI cannot demand that; a recommendation for utf-8 in a few draft
> specifications is at least a decade removed from ubiquitous
> implementation. We can default to utf-8 at best. I discussed this at
> length in
> http://mail.python.org/pipermail/web-sig/2009-August/003948.html
>
>


Hi,

that post does make a good case for why "a single encoding is not
acceptable".  utf-8 seems to be the most common encoding at this point,
so it makes sense as the default... but we do need a way to specify
the encoding.

Is that what you're saying, Robert?  Do you have a suggestion for
specifying encodings?

I think surrogateescape will handle the issue of letting arbitrary
bytes survive in strings decoded as utf-8.
    http://www.python.org/dev/peps/pep-0383/

However, I think that is only implemented in python 3.1?... but maybe
there is some way to have it work on other pythons too?
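
As a rough sketch of what surrogateescape buys us (python 3.1+ only; the
helper names decode_path and encode_path are made up for illustration and
are not part of any WSGI draft): bytes that are not valid utf-8 get
smuggled through as lone surrogates and come back out unchanged, so they
can be re-decoded with another encoding later.

```python
def decode_path(raw: bytes) -> str:
    # Undecodable bytes become lone surrogates (U+DC80..U+DCFF) instead of
    # raising UnicodeDecodeError.
    return raw.decode("utf-8", "surrogateescape")


def encode_path(text: str) -> bytes:
    # The reverse step turns those surrogates back into the original bytes.
    return text.encode("utf-8", "surrogateescape")


# A Latin-1 encoded path: 0xE9 is not valid utf-8 in this position.
raw = b"/caf\xe9/menu"
text = decode_path(raw)

# The round trip is lossless, so the application can still re-decode the
# original bytes with the encoding it actually wants.
round_tripped = encode_path(text)
latin1_text = round_tripped.decode("iso-8859-1")
```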


How about...

Being able to request which encoding you want has the benefit of only
having to store one representation before 'baking' the result into the
environ.  So if someone only ever wants utf-8 they can get it...
however when they 'bake' the environ they can request something else
instead.  This is similar to a per-server setting, but I think it
should work with middleware too?  Multiple versions of things could be
available, and once they are baked, middleware that wants to modify
things will need to change each version.

These 'baking' methods could live in wsgi to simplify modifying the
environ's multiple versions of things.  wsgi would just have some
get/set functions to put correct handling of encodings in one place.
Of course middleware is still free to change things as it wants.
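
To make the get/set idea a bit more concrete, here is a minimal sketch.
Everything in it is hypothetical: RawEnviron, get, set and bake are names
invented for this example, not an existing or proposed wsgi API. It keeps
the raw bytes as the single stored representation and only decodes when
someone asks for a particular encoding.

```python
class RawEnviron:
    """Hypothetical helper: stores raw bytes, decodes on request."""

    def __init__(self, raw):
        # key -> bytes, the one representation that is actually stored
        self._raw = dict(raw)

    def get(self, key, encoding="utf-8"):
        # Decode lazily; surrogateescape keeps undecodable bytes intact.
        return self._raw[key].decode(encoding, "surrogateescape")

    def set(self, key, value, encoding="utf-8"):
        # Middleware writes through the same place, so every later
        # get() or bake() sees the change regardless of encoding.
        self._raw[key] = value.encode(encoding, "surrogateescape")

    def bake(self, encoding="utf-8"):
        # Produce a plain dict of str values in the requested encoding.
        return {
            key: value.decode(encoding, "surrogateescape")
            for key, value in self._raw.items()
        }


env = RawEnviron({"PATH_INFO": b"/caf\xc3\xa9"})
as_utf8 = env.get("PATH_INFO")                   # default encoding
as_latin1 = env.get("PATH_INFO", "iso-8859-1")   # same bytes, other view

# Middleware rewriting the path in Latin-1, then an app baking as utf-8.
env.set("PATH_INFO", "/th\xe9", "iso-8859-1")
baked = env.bake("utf-8")
```

Because only the bytes are stored, a middleware that calls set() does not
have to update several decoded copies; that cost only appears if the
decoded versions are baked into the environ up front.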


cheers,

