On Thu, Sep 16, 2010 at 4:58 PM, Armin Ronacher <span dir="ltr">&lt;<a href="mailto:armin.ronacher@active-4.com">armin.ronacher@active-4.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
- Bytes values in the environment:<br>
<br>
HTTP transmits bytes, that's a fact we can't change. When we go<br>
with native strings we will go with unicode on 3.x. This has the<br>
following implications:<br>
<br>
- getting the right path info requires a decode + an encode<br>
unless you are assuming latin1.<br></blockquote><div><br>Not if you are working with the URL-encoded paths.<br> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
- same as above for the script name and cookie header<br></blockquote><div><br>Cookie is weird. If that one header could be bytes, that'd be great... but special-casing Cookie/Set-Cookie is too hard/weird.<br><br>
Plus handling Cookie/Set-Cookie as Latin1 is just one more line of code (well, two, one for each header).<br><br></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
When going with unicode strings on 3.x for environ values, we would<br>
have to do the same for outgoing values which makes middlewares a lot<br>
harder to write:<br></blockquote><div><br>Response headers that carry URLs (e.g., Location) use the encoded form, so SCRIPT_NAME/PATH_INFO issues don't come into play. Set-Cookie could be an issue, though only really when someone wants to replicate an external system's weird cookies -- legacy issues aside, it's best for application developers to stick to ASCII cookies (URL-encoding cookie values is a popular way of doing this).<br>
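<br>As a rough sketch of what URL-encoding buys you on the way out (the helper names <code>ascii_cookie_value</code> and <code>location_header</code> are made up for illustration, and UTF-8 is assumed as the encoding underneath the percent-escapes):<br>

```python
from urllib.parse import quote, unquote


def ascii_cookie_value(value):
    """Percent-encode a text value so the cookie stays pure ASCII.

    Hypothetical helper: UTF-8 under the percent-escapes is an assumption.
    """
    return quote(value, safe="")


def location_header(base, path):
    """Build a Location value from a text path, keeping "/" unescaped."""
    return base + quote(path, safe="/")


cookie = ascii_cookie_value("caf\u00e9 au lait")
print(cookie)            # ASCII only, safe in a Set-Cookie header
print(unquote(cookie))   # the application decodes it again on the way in
print(location_header("http://example.com", "/m\u00fcnchen/index"))
```

Both values are plain ASCII by the time they reach the header, so it doesn't matter whether the server treats headers as bytes or as Latin-1 text.<br>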
<br>I don't know of any other header (or the status) that would reasonably cause a problem. And I'm not glossing over corner cases -- I'm generally very aware of, and concerned with, legacy issues and interacting with legacy systems. There just aren't any here except for the resolvable issues I've listed.<br>
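<br>For the resolvable issues (PATH_INFO and Cookie), here is a minimal sketch of the round trip, assuming the server hands over native strings that were decoded from the raw bytes as Latin-1. The <code>recover_text</code> helper and the sample environ are hypothetical, and UTF-8 as the "real" encoding is an assumption too:<br>

```python
def recover_text(native, encoding="utf-8"):
    """Undo the assumed Latin-1 decoding, then decode with the real encoding."""
    return native.encode("latin-1").decode(encoding)


# Simulate what the server would hand over: raw bytes decoded as Latin-1.
raw_path = "/caf\u00e9".encode("utf-8")
environ = {
    "PATH_INFO": raw_path.decode("latin-1"),
    "HTTP_COOKIE": b"name=\xe9t\xe9".decode("latin-1"),
}

# PATH_INFO: the decode + encode step, only needed for the unquoted path.
path = recover_text(environ["PATH_INFO"])

# Cookie: one line gets the original bytes back, whatever they were.
cookie_bytes = environ["HTTP_COOKIE"].encode("latin-1")

print(path)
print(cookie_bytes)
```

Latin-1 maps every byte to exactly one code point, so the round trip is lossless even when the cookie isn't valid in any particular encoding.<br>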
<br></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
- web3.errors<br>
<br>
I think Ian raised a concern that it's specified to support unicode<br>
only. I don't think changing that to accept either bytes or unicode<br>
is a good idea on Python 3, where there is no stream in the language<br>
or standard library that accepts both at the same time.<br>
An implementation for 2.x could support both, but I don't know if<br>
there is a use case for that. In general though I have to say that<br>
very few people use wsgi.errors currently, so I don't think this is<br>
a real issue anyways.<br></blockquote><div><br>It's more of an issue under Python 2; it could probably be ignored with Python 3. Under Python 2, when you have some error condition, it's really frustrating to hit a unicode error while logging that error (often covering up the original error).<br>
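<br>One way an application could guard against that, sketched in Python 3 syntax: a small wrapper that never raises on odd input. The <code>safe_log</code> name is hypothetical, the <code>StringIO</code> only stands in for the errors stream, and decoding as UTF-8 with replacement characters is an assumption:<br>

```python
import io


def safe_log(stream, message):
    """Write message to a text stream without raising on undecodable bytes."""
    if isinstance(message, bytes):
        # Replace bad bytes instead of failing, so the original error
        # still reaches the log.
        message = message.decode("utf-8", errors="replace")
    stream.write(message + "\n")


errors = io.StringIO()  # stand-in for the unicode-only errors stream
safe_log(errors, "plain text error")
safe_log(errors, b"bad bytes: \xff\xfe")
print(errors.getvalue())
```

The stream itself still only ever sees text; the bytes-or-unicode decision stays in application code.<br>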
<br><br></div></div>-- <br>Ian Bicking | <a href="http://blog.ianbicking.org">http://blog.ianbicking.org</a><br>