[Web-SIG] Request for Comments on upcoming WSGI Changes
Henry Precheur
henry at precheur.org
Mon Sep 21 22:52:18 CEST 2009
On Mon, Sep 21, 2009 at 09:14:13PM +0200, Armin Ronacher wrote:
> So the same standard should have different behavior on different Python
> versions? That would make framework code a lot more complicated.
I don't understand why it would be 'a lot more' complicated.
(The following code snippets are Python 3 only, and assume we're using
'native strings' everywhere.)
In the gateway, environ would be populated this way:
environ['some_key'] = some_value.decode('utf8', 'surrogateescape')
Compare that to the utf-8-then-latin-1 alternative:
try:
    environ['some_key'] = some_value.decode('utf-8')
    environ['some_key.encoding'] = 'utf-8'
except UnicodeError:
    environ['some_key'] = some_value.decode('latin-1')
    environ['some_key.encoding'] = 'latin-1'
In the application, you would get the original value back with:
environ['some_key'].encode('utf8', 'surrogateescape')
With utf8-then-latin1:
environ['some_key'].encode(environ['some_key.encoding'])
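To make the comparison concrete, here is a minimal round-trip sketch of both approaches (Python 3 only). The helper names, the sample value b'/caf\xe9' and the two environ dicts are made up for illustration; they are not part of any WSGI draft. The sample value is deliberately not valid UTF-8, so the second approach takes its latin-1 branch.

```python
# Illustrative sketch only: the helper names below are hypothetical.

def populate_surrogateescape(environ, key, raw):
    # Gateway side: undecodable bytes become lone surrogates (PEP 383).
    environ[key] = raw.decode('utf-8', 'surrogateescape')


def populate_with_fallback(environ, key, raw):
    # Gateway side: try utf-8 first, fall back to latin-1, and record
    # which encoding was used under a companion key.
    try:
        environ[key] = raw.decode('utf-8')
        environ[key + '.encoding'] = 'utf-8'
    except UnicodeError:
        environ[key] = raw.decode('latin-1')
        environ[key + '.encoding'] = 'latin-1'


def original_bytes_surrogateescape(environ, key):
    # Application side: one fixed call, no extra key needed.
    return environ[key].encode('utf-8', 'surrogateescape')


def original_bytes_with_fallback(environ, key):
    # Application side: must look up the recorded encoding first.
    return environ[key].encode(environ[key + '.encoding'])


# Assumed sample value: 0xe9 is 'é' in latin-1 and invalid as UTF-8 here.
raw = b'/caf\xe9'

env_a = {}
populate_surrogateescape(env_a, 'some_key', raw)

env_b = {}
populate_with_fallback(env_b, 'some_key', raw)

print(original_bytes_surrogateescape(env_a, 'some_key') == raw)
print(original_bytes_with_fallback(env_b, 'some_key') == raw)
```

Both approaches recover the original bytes. The difference is that the first needs a single key and a fixed pair of calls, while the second needs a companion '.encoding' key that the gateway and the application both have to agree on.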
The 'surrogateescape' way is clearly simpler. The 'equivalent' Python 2
code is even simpler:
environ['some_key'] = some_value
And:
environ['some_key']
--
Henry Prêcheur