[Web-SIG] PEP 444 (aka Web3)

Ian Bicking ianb at colorstudy.com
Fri Sep 17 04:21:34 CEST 2010


On Thu, Sep 16, 2010 at 9:59 PM, Armin Ronacher <armin.ronacher at active-4.com> wrote:

>  On 9/17/10 3:43 AM, Ian Bicking wrote:
>
>> Not if you are working with the URL-encoded paths.
>>
>
> SCRIPT_NAME / PATH_INFO will always stay unencoded and the current spec
> requires the web3.script_name thing to only be provided if the server can
> safely provide that.  So at least for the fallback, we are dealing with
> (properly latin1 decoded) non-URL encoded things.  Can be changed of course.


Yes, if we get rid of SCRIPT_NAME/PATH_INFO then the problem goes away.  For
servers without access to the unencoded value, reencoding those values
doesn't actually lose any information over what we have now, and avoids any
encoding issues.  Servers with REQUEST_URI can at least attempt to
reconstruct the encoded values.
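
A minimal sketch of what I mean, assuming the latin1-decoded SCRIPT_NAME/PATH_INFO convention discussed above. The function name encoded_path is made up for illustration, and REQUEST_URI is a non-standard variable that only some servers provide (this also assumes it holds a path, not an absolute URI):

```python
from urllib.parse import quote


def encoded_path(environ):
    """Return a URL-encoded path for the request (hypothetical helper)."""
    request_uri = environ.get("REQUEST_URI")
    if request_uri is not None:
        # The server kept the original request line, so the encoded path
        # can be recovered by dropping the query string.
        return request_uri.split("?", 1)[0]
    # Fallback: the values are latin1-decoded strings, so encoding them back
    # to latin1 recovers the raw bytes, which are then percent-encoded.
    raw = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
    return quote(raw.encode("latin-1"))


# With REQUEST_URI an encoded slash survives; without it, it is already gone.
with_uri = encoded_path({"REQUEST_URI": "/a%2Fb?x=1", "PATH_INFO": "/a/b"})
without_uri = encoded_path({"SCRIPT_NAME": "", "PATH_INFO": "/a/b"})
```

In the fallback case the %2F was already lost when the server decoded the path, so re-encoding loses nothing beyond that.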



>> Cookie is weird.  If that one header could be bytes, that'd be great...
>> but special-casing Cookie/Set-Cookie is too hard/weird.
>>
> Special casing one header is indeed weird.


Cookie is also the one header that can't be safely folded.  It's just a
messed up header, and requires hacky workarounds.
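
To illustrate the folding problem, here is a rough sketch using Set-Cookie values with Expires dates. The helper names fold and naive_unfold are made up, and the cookie values are invented for the example:

```python
def fold(values):
    """Combine repeated header values into one comma-separated value."""
    return ", ".join(values)


def naive_unfold(folded):
    """Split a folded header value back apart on commas."""
    return [part.strip() for part in folded.split(",")]


# Most headers survive the round trip.
accept = ["text/html", "application/json"]
accept_round_trip = naive_unfold(fold(accept))

# Cookie expiry dates contain a comma, so the round trip breaks.
cookies = [
    "a=1; Expires=Fri, 17 Sep 2010 04:21:34 GMT",
    "b=2; Expires=Sat, 18 Sep 2010 04:21:34 GMT",
]
cookie_round_trip = naive_unfold(fold(cookies))
```

The two cookies come back as four fragments, which is why these values have to be kept as separate header lines.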



>> I don't know of any other header (or the status) that would reasonably
>> cause a problem.  And I'm not glossing over corner cases -- I'm
>> generally very aware and concerned with legacy issues, and interacting
>> with legacy systems.  There just aren't any here except for the
>> resolvable issues I've listed.
>>
> Technically speaking it would affect etags too, but I doubt anyone is using
> non-ASCII quoted strings there.  A very funny header is btw the Warning
> header which actually can have any encoding:
>
> "The warn-text SHOULD be in a natural language and character set that is
> most likely to be intelligible to the human user receiving the response.
> This decision MAY be based on any available knowledge, such as the location
> of the cache or user, the Accept-Language field in a request, the
> Content-Language field in a response, etc. The default language is English
> and the default character set is ISO-8859-1.
>
> If a character set other than ISO-8859-1 is used, it MUST be encoded in the
> warn-text using the method described in RFC 2047 [14]."
>
> Doubt anyone is using that header though.
>

The Title header (in Atompub) also suggests RFC 2047 encoding, but that's
essentially an ASCII conversion, like URL quoting.  It looks something like:
=?iso-8859-1?q?p=F6stal?=
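
For what it's worth, a value like that can be decoded with the standard library's email.header module. This is only a sketch of the decoding step, not a suggestion for how a server should treat the header:

```python
from email.header import decode_header, make_header

encoded = "=?iso-8859-1?q?p=F6stal?="

# decode_header returns (bytes, charset) pairs for each encoded word.
parts = decode_header(encoded)

# make_header reassembles the pairs into a Header; str() gives the text.
decoded = str(make_header(parts))
```

The value on the wire stays pure ASCII; only the application that cares about the text needs to decode it.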


-- 
Ian Bicking  |  http://blog.ianbicking.org