[Web-SIG] Python 3.0 and WSGI 1.0.

Alan Kennedy alan at xhaus.com
Thu Apr 2 13:19:34 CEST 2009


[Sylvain]
> Would there be any interest in asking the HTTP-BIS working group [1] what
> they think about it?
>
> Currently I couldn't find anything in their drafts suggesting they had
> decided to clarify this issue from a protocol's perspective but they might
> consider it to be relevant to their goals.
>
> - Sylvain
>
> [1] http://www.ietf.org/html.charters/httpbis-charter.html

I checked the current version of their replacement for RFC 2616. It says:

"""
2.1.3.  URI Comparison

   When comparing two URIs to decide if they match or not, a client
   SHOULD use a case-sensitive octet-by-octet comparison of the entire
   URIs
"""

That doesn't work if the two URIs being compared were percent-encoded
from different character encodings: the same characters produce
different octets.
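
A quick sketch of the problem, using Python 3's urllib.parse.quote. The
sample path is made up for illustration, and UTF-8 and Latin-1 are just
two encodings a client might plausibly use:

```python
from urllib.parse import quote

# Hypothetical path containing one non-ASCII character (e with acute accent).
path = "/caf\u00e9"

# Percent-encode the same characters via two different character encodings.
as_utf8 = quote(path, encoding="utf-8")
as_latin1 = quote(path, encoding="latin-1")

print(as_utf8)               # /caf%C3%A9
print(as_latin1)             # /caf%E9
print(as_utf8 == as_latin1)  # False
```

Both strings name the same resource as far as the user is concerned, but
an octet-by-octet comparison treats them as different URIs.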

I did find this page on the W3C site, which at least explains the
issues and surveys how current browsers encode URIs and IRIs.

http://www.w3.org/International/articles/idn-and-iri/

"""
Paths

The conversion process for parts of the IRI relating to the path is
already supported natively in the latest versions of IE7, Firefox,
Opera, Safari and Google Chrome.

It works in Internet Explorer 6 if the option in Tools>Internet
Options>Advanced>Always send URLs as UTF-8 is turned on. This means
that links in HTML, or addresses typed into the browser's address bar
will be correctly converted in those user agents. It doesn't work out
of the box for Firefox 2 (although you may obtain results if the IRI
and the resource name are in the same encoding), but technically-aware
users can turn on an option to support this (set
network.standard-url.encode-utf8 to true in about:config).

Whether or not the resource is found on the server, however, is a
different question. If the file system is in UTF-8, there should be no
problem. If not, and no mechanism is available to convert addresses
from UTF-8 to the appropriate encoding, the request will fail.

Files are normally exposed as UTF-8 by servers such as IIS and Apache
2 on Windows and Mac OS X. Unix and Linux users can store file names
in UTF-8, or use the mod_fileiri module mentioned earlier. Version 1
of the Apache server doesn't yet expose filenames as UTF-8.

You can run a basic check whether it works for your client and
resource using this simple test.

Note that, while the basics may work, there are other somewhat more
complicated aspects of IRI support, such as handling of bidirectional
text in Arabic or Hebrew, which may need some additional time for full
implementation.
"""

Alan.

