[Python-Dev] Python-3.0, unicode, and os.environ

André Malo nd at perlig.de
Sat Dec 13 05:47:47 CET 2008


* Adam Olsen wrote:

> On Fri, Dec 12, 2008 at 2:11 AM, André Malo <nd at perlig.de> wrote:
> > * Adam Olsen wrote:
> >> UTF-8 in percent encodings is becoming a de facto standard.  Otherwise
> >> the browser has to display the percent escapes in the address bar,
> >> rather than the intended text.
> >
> > Duh! The address bar should contain the URL, which *is* the intended
> > text. The escapes are there for a reason. If I pass some octets using
> > percent escapes via the query string or request body, it's not text,
> > not even intended. It's still a collection of octets. Translating them
> > back (and forth when I press enter in the address bar) is a pretty
> > ambiguous operation and therefore pretty wrong.
> >
> > The de facto standard does not exist. There's a real one instead: RFC
> > 2396.
>
> All the heaps of people using non-English Wikipedia sites might
> disagree with you.  There's only, what, a few *million* pages that
> would be affected?

I'm not sure what you're trying to pull here. Is that supposed to be an
argument? No page is affected at all. It's a browser UI issue, not a
page issue.

And even if it mattered how the URL escapes are displayed in the address
bar, many of those millions of people would favour KOI8-R or Big5 over
UTF-8 if you asked them.

Which leads to the exact point: The browser cannot know the encoding, nor
should it need to. The escaped octets are opaque to it. The only entity
which needs to understand the encoding of URL percent escapes in the query
or request body is the *server* selecting the resource.
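
To illustrate the ambiguity, here is a minimal sketch in Python 3. The
sample word and the helper name decode_escapes are mine, chosen only for
illustration; the sketch assumes nothing beyond the standard urllib.parse
module. The same text yields different percent escapes depending on the
encoding the producer picked, and the escapes themselves carry no hint of
which one that was:

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical sample text; any non-ASCII string would do.
text = "привет"

# The same text, percent-escaped from two different byte encodings.
escaped_koi8 = quote(text.encode("koi8-r"))
escaped_utf8 = quote(text.encode("utf-8"))


def decode_escapes(escaped, encoding):
    """Turn percent escapes back into octets, then guess at an encoding.

    Illustrative helper, not part of any library. Undecodable octets are
    replaced rather than raising, so a wrong guess yields garbage text.
    """
    octets = unquote_to_bytes(escaped)
    return octets.decode(encoding, errors="replace")


# The escapes differ, and nothing in them names the encoding used.
print(escaped_koi8)
print(escaped_utf8)

# Only a consumer that already knows the encoding recovers the text.
print(decode_escapes(escaped_koi8, "koi8-r"))
print(decode_escapes(escaped_koi8, "utf-8"))
```

A client that assumes UTF-8 gets garbage for the KOI8-R escapes, while the
server that defined the resource decodes them correctly.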

But I'm sure I'm not telling you any news here.

nd
-- 
"Gates's behaviour had proven to me that I need not count on him and his
two companions" -- Karl May, "Winnetou III"

Something new in the West: <http://pub.perlig.de/books.html#apache2>

