[Web-SIG] Proposal to remove SCRIPT_NAME/PATH_INFO

Wed Sep 23 07:25:45 CEST 2009

On Tue, Sep 22, 2009 at 09:22:48PM -0500, Ian Bicking wrote:
> Well, the biggie: is it right to use native strings for the environ values,
> and response status/headers?  Specifically, tricks like the latin1
> transcoding won't work in Python 2, but will in Python 3.  Is this weird?
> Or just something you have to think about when using the two Python
> versions?

I don't have the whole discussion in mind. But apart from 'using unicode
everywhere', I don't think there's a single proposal that would allow
people to keep the same 'logic' in both Python 2 & 3.

Using bytes in Python 3 requires you to have 2 different 'logics' for
Python 2 and 3, because bytes can't do everything that str can do, and
because of the stdlib's problems with bytes.

Using str in Python 3 requires you to have 2 different 'logics' too,
because Python 3's str is not Python 2's str: the former is Unicode
text, the latter is a byte string.

(Just to make things clear, the term 'logic' refers to the transcoding
of strings into the correct encoding.)
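
As a rough illustration of the Latin-1 trick Ian mentions, here is a
minimal sketch for Python 3 only. The helper name `recode_native` and the
assumption that the client sent UTF-8 are mine, not part of any proposal:

```python
def recode_native(native_value, encoding="utf-8"):
    """Recover real text from a Latin-1 decoded native string (Python 3).

    Hypothetical helper: it assumes the server built the str by decoding
    the raw bytes as Latin-1, so encoding back to Latin-1 is lossless.
    """
    raw = native_value.encode("latin-1")   # back to the original bytes
    return raw.decode(encoding)            # decode with the real encoding


# Simulate what a server would put in environ for a UTF-8 encoded path.
wire_bytes = "/caf\u00e9".encode("utf-8")
native = wire_bytes.decode("latin-1")      # mojibake, but no data is lost
text = recode_native(native)
```

In Python 2 a str has no encode/decode round trip of this kind (it is
already bytes), which is why the same code cannot serve both versions.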

> What happens if you give unicode text in the response headers that cannot be
> encoded as Latin1?

We could ignore the header. But if a response header contains non-Latin-1
characters, it's not WSGI compliant; I would therefore expect an error.

To cite The Zen of Python:

    Errors should never pass silently.
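
A minimal sketch of what 'expect an error' could look like on the server
side in Python 3. The function name `encode_header_value` is hypothetical,
and raising ValueError is just one possible choice:

```python
def encode_header_value(value):
    """Encode a response header value for the wire, failing loudly.

    Hypothetical helper: a value that cannot be represented in Latin-1
    is rejected instead of being dropped or silently mangled.
    """
    try:
        return value.encode("latin-1")
    except UnicodeEncodeError as exc:
        raise ValueError(
            "header value is not Latin-1 encodable: %r" % (value,)
        ) from exc


ok = encode_header_value("text/html; charset=utf-8")

try:
    encode_header_value("price: \u20ac10")   # the euro sign is not in Latin-1
    rejected = False
except ValueError:
    rejected = True
```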

> Should some things be unicode on Python 2?

No. I think it's more important to keep WSGI simple. Let's use str
everywhere. Frameworks can always transcode what should be Unicode,
that's their job.

> Is there a common case here that would be inefficient?

Transcoding every string from Latin-1 to Unicode could be
time-consuming. The only way I see to make things faster is to use bytes
everywhere, but that's not possible given the previous discussions.

-- 
  Henry Prêcheur