[Web-SIG] Request for Comments on upcoming WSGI Changes

Henry Precheur henry at precheur.org
Mon Sep 21 22:52:18 CEST 2009


On Mon, Sep 21, 2009 at 09:14:13PM +0200, Armin Ronacher wrote:
> So the same standard should have different behavior on different Python
> versions?  That would make framework code a lot more complicated.

I don't understand why it would be 'a lot more' complicated.

(The following code snippets are Python 3 only, and assume we're using
'native strings' everywhere)

In the gateway, environ would be populated this way:

  environ['some_key'] = some_value.decode('utf8', 'surrogateescape')

Compare that to the utf-8-then-latin-1 alternative:

  try:
      environ['some_key'] = some_value.decode('utf-8')
      environ['some_key.encoding'] = 'utf-8'
  except UnicodeError:
      environ['some_key'] = some_value.decode('latin-1')
      environ['some_key.encoding'] = 'latin-1'


In the application, you would get the original value back this way:

  environ['some_key'].encode('utf8', 'surrogateescape')

With utf-8-then-latin-1:

  environ['some_key'].encode(environ['some_key.encoding'])
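To make the comparison concrete, here is a minimal round-trip sketch of both approaches (Python 3.1+ only). The helper names and the sample byte strings are made up for illustration; they are not part of any WSGI proposal.

```python
def gateway_surrogateescape(environ, raw):
    # Gateway side: undecodable bytes become lone surrogates.
    environ['some_key'] = raw.decode('utf-8', 'surrogateescape')


def app_surrogateescape(environ):
    # Application side: the same error handler restores the bytes.
    return environ['some_key'].encode('utf-8', 'surrogateescape')


def gateway_fallback(environ, raw):
    # Gateway side: try utf-8 first, fall back to latin-1, and
    # record which encoding was used in a second key.
    try:
        environ['some_key'] = raw.decode('utf-8')
        environ['some_key.encoding'] = 'utf-8'
    except UnicodeError:
        environ['some_key'] = raw.decode('latin-1')
        environ['some_key.encoding'] = 'latin-1'


def app_fallback(environ):
    # Application side: needs the extra key to pick the encoding.
    return environ['some_key'].encode(environ['some_key.encoding'])


# Hypothetical inputs: valid utf-8, then a latin-1 byte that is
# not valid utf-8.
samples = [b'/caf\xc3\xa9', b'/caf\xe9']

for raw in samples:
    env1, env2 = {}, {}
    gateway_surrogateescape(env1, raw)
    gateway_fallback(env2, raw)
    assert app_surrogateescape(env1) == raw
    assert app_fallback(env2) == raw
```

Both variants give back the original bytes; the difference is that the second one needs an extra environ key and a try/except in the gateway.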


The 'surrogateescape' way is clearly simpler. The 'equivalent' Python 2
code is even simpler:

  environ['some_key'] = some_value

And:

  environ['some_key']


-- 
  Henry Prêcheur

