[Web-SIG] Python 3.0 and WSGI 1.0.

Ian Bicking ianb at colorstudy.com
Wed May 6 05:27:17 CEST 2009


On Tue, May 5, 2009 at 10:14 PM, Graham Dumpleton <
graham.dumpleton at gmail.com> wrote:

> 2009/5/6 Ian Bicking <ianb at colorstudy.com>:
> > Philip Jenvey brought this to my attention:
> >
> >   http://www.python.org/dev/peps/pep-0383/
> >
> > It's a UTF8 encoding and decoding scheme that encodes illegal bytes in such
> > a way that you can decode to get the original bytes object, and thus
> > transcode to another encoding.  It's intended for cases exactly like WSGI.
>
> Care to explain then how that would in practice be used while I try
> and reread it a few times to try and understand it myself? :-)
>

I don't particularly know, except I think you'd do things like:

environ['PATH_INFO'] = urllib.unquote(http_byte_path).decode('utf8',
'python-escape')

Then if the encoding was wrong, you could transcode like:

environ['PATH_INFO'] = environ['PATH_INFO'].encode('utf8',
'python-escape').decode('latin1', 'python-escape')

Note that to transcode you need to know both the encoding that was used
(utf8 in this case) and that python-escape was used.  It has been suggested
that the server should put the encoding it used into the environment.  When
transcoding, that entry should also be updated.
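
A rough sketch of that round trip, for what it's worth.  It assumes Python
3.1 or later, where the PEP 383 error handler is available under the name
'surrogateescape' (used here in place of 'python-escape').  The sample path
and the 'example.path_encoding' key are made up for illustration; no
environ key for the encoding has actually been agreed on.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical request path: "/café" percent-encoded as Latin-1, not UTF-8.
http_byte_path = b"/caf%E9"

# Server side: unquote to raw bytes, then decode as UTF-8. The invalid byte
# is escaped by the error handler instead of raising UnicodeDecodeError.
environ = {}
environ['PATH_INFO'] = unquote_to_bytes(http_byte_path).decode(
    'utf-8', 'surrogateescape')
# Hypothetical key recording which encoding the server used.
environ['example.path_encoding'] = 'utf-8'
server_value = environ['PATH_INFO']


def transcode(environ, key, new_encoding):
    """Re-decode environ[key] with new_encoding and update the record."""
    old_encoding = environ['example.path_encoding']
    # Encoding with the same codec and handler recovers the original bytes.
    raw = environ[key].encode(old_encoding, 'surrogateescape')
    environ[key] = raw.decode(new_encoding, 'surrogateescape')
    environ['example.path_encoding'] = new_encoding
    return raw


# Application side: it turns out the client sent Latin-1.
original_bytes = transcode(environ, 'PATH_INFO', 'latin-1')
```

The application has to know both the recorded encoding and the error
handler to get the bytes back, which is why the encoding is stored next to
the value and rewritten on every transcode.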

It's not clear what python-escape is going to do; I don't think that's been
determined.  Probably it'll put \x00 or something in the unicode string to
mark raw bytes.
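
For comparison, here is a small sketch of how the marking looks with the
handler as it exists in Python 3.1 and later under the name
'surrogateescape': each undecodable byte 0xNN is represented by the lone
surrogate code point U+DCNN.  The sample bytes are made up.

```python
# Hypothetical path bytes containing one byte (0xE9) that is invalid UTF-8.
raw = b'/caf\xe9'

text = raw.decode('utf-8', 'surrogateescape')

# Collect the code points that stand in for raw, undecodable bytes.
markers = [hex(ord(ch)) for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF]

# Encoding with the same handler turns the markers back into the bytes.
round_tripped = text.encode('utf-8', 'surrogateescape')
```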

-- 
Ian Bicking  |  http://blog.ianbicking.org