[Python-Dev] Bytes path support
Nikolaus at rath.org
Wed Aug 27 03:39:35 CEST 2014
Nick Coghlan <ncoghlan at gmail.com> writes:
>>>> As some examples of where bilingual computing breaks down:
>>>> * My NFS client and server may have different locale settings
>>>> * My FTP client and server may have different locale settings
>>>> * My SSH client and server may have different locale settings
>>>> * I save a file locally and send it to someone with a different locale
>>>> * I attempt to access a Windows share from a Linux client (or
>>>> * I clone my POSIX hosted git or Mercurial repository on a Windows
>>>> * I have to connect my Linux client to a Windows Active Directory
>>>> domain (or vice-versa)
>>>> * I have to interoperate between native code and JVM code
>>>> The entire computing industry is currently struggling with this
>>>> monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale
>>>> encoding/code pages) -> multilingual (Unicode) transition. It's been
>>>> going on for decades, and it's still going to be quite some time
>>>> before we're done.
>>>> The POSIX world is slowly clawing its way towards a multilingual model
>>>> that actually works: UTF-8
>>>> Windows (including the CLR) and the JVM adopted a different
>>>> multilingual model, but still one that actually works: UTF-16-LE
>> Nick, I think the first half of your post is one of the clearest
> expositions yet of 'why Python 3' (in particular, the str to unicode
> change). It is worthy of wider distribution and without much change, it
> would be a great blog post.
> Indeed, I had the same idea - I had been assuming users already understood
> this context, which is almost certainly an invalid assumption.
> The blog post version is already mostly written, but I ran out of weekend.
> Will hopefully finish it up and post it some time in the next few days
In that case, maybe it'd be nice to also explain why you use the term
"bilingual" for codepage based encoding. At least to me, a
codepage/locale is pretty monolingual, or alternatively covering a whole
region (e.g. western europe). I figure with bilingual you mean ascii +
something, but that's mostly a guess from my side.
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«
More information about the Python-Dev