[Web-SIG] Request for Comments on upcoming WSGI Changes
renesd at gmail.com
Mon Sep 21 18:00:06 CEST 2009
On Mon, Sep 21, 2009 at 4:42 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 04:30 PM 9/21/2009 +0100, René Dudfield wrote:
>> On Mon, Sep 21, 2009 at 4:19 PM, P.J. Eby <pje at telecommunity.com> wrote:
>> > At 12:25 AM 9/21/2009 -0400, Chris McDonough wrote:
>> >> Anyway, for us slower (and maybe wrongly fearful) folks, could someone
>> >> summarize the benefits of having a WSGI specification that requires
>> >> Unicode.
>> >> Bonus points for an explanation that does not boil down to "it will be
>> >> compatible with Python 3".
>> > +1. I'd really rather not have the spec dictated by the need to work
>> > around
>> > problems in the stdlib or language definition. Better to fix them ASAP.
>> here is a summary:
>> Apart from Python 3 compatibility (which should be a good enough
>> reason), UTF-8 is what's used in HTTP a lot these days. Most things
>> layered on top of WSGI are using UTF-8 (Django etc.), and lots of web
>> clients are using UTF-8 (Firefox etc.).
> Since WSGI is based on HTTP, please cite RFCs, not applications. Thanks.
That seems a strange thing to say. HTTP in practice is based not only
on RFCs but also on real applications. The Web Server Gateway Interface
is obviously not just about HTTP: it also covers Python and web server
issues, so it hardly restricts itself to HTTP.
See the W3C page on IRIs: http://www.w3.org/International/O-URL-and-ident.html
It links to a number of documents, including RFC 2718, which recommends
UTF-8 for URIs: http://www.ietf.org/rfc/rfc2718.txt
From its character encoding section:
"""Unless there is some compelling reason for a particular scheme to
do otherwise, translating character sequences into UTF-8 (RFC 2279)
and then subsequently using the %HH encoding for unsafe octets is
recommended."""
Which seems sensible.
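
As a rough sketch of what that recommendation amounts to in Python 3
(the helper name iri_path_to_uri is made up for illustration, and
keeping "/" unescaped is my assumption, not something the RFC text
above says):

```python
from urllib.parse import quote, unquote_to_bytes


def iri_path_to_uri(path: str) -> str:
    """Hypothetical helper: UTF-8 encode, then %HH-escape unsafe octets."""
    # quote() encodes the string with the given encoding and then
    # percent-escapes every octet that is not unreserved or in `safe`.
    return quote(path, safe="/", encoding="utf-8")


encoded = iri_path_to_uri("/café/résumé")
print(encoded)

# Undoing the %HH escapes gives back the UTF-8 octets.
print(unquote_to_bytes(encoded))
```

Non-ASCII characters end up as two or more %HH escapes each, since
each one becomes several octets in UTF-8.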
Having a fallback to the raw bytes available also seems sensible, for
the reasons discussed in previous posts.
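
To make the fallback idea concrete, here is a minimal sketch, not a
proposal for the actual WSGI API. The function name decode_path and
the (text, raw) return shape are assumptions of mine: try UTF-8 first,
and always hand the raw bytes along so the application can deal with
anything that does not decode.

```python
from typing import Optional, Tuple
from urllib.parse import unquote_to_bytes


def decode_path(escaped_path: str) -> Tuple[Optional[str], bytes]:
    """Hypothetical helper: return (text or None, raw bytes)."""
    # Undo the %HH escapes to recover the octets the client sent.
    raw = unquote_to_bytes(escaped_path)
    try:
        # Preferred case: the octets are valid UTF-8.
        return raw.decode("utf-8"), raw
    except UnicodeDecodeError:
        # Fallback: no text form, but the raw bytes are still available.
        return None, raw


print(decode_path("/caf%C3%A9"))  # valid UTF-8
print(decode_path("/caf%E9"))     # Latin-1 octet, not valid UTF-8
```

The second call is the kind of request a Latin-1 client might send;
the application still gets the bytes and can pick its own encoding.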