On Thu, 5 Jun 2014 23:15:54 +1000 Nick Coghlan firstname.lastname@example.org wrote:
On 5 June 2014 22:37, Paul Sokolovsky email@example.com wrote:
On Thu, 5 Jun 2014 22:20:04 +1000 Nick Coghlan firstname.lastname@example.org wrote:
problems caused by trusting the locale encoding to be correct, but the startup code will need non-trivial changes for that to happen
- the C.UTF-8 locale may even become widespread before we get
... And until those golden times come, it would be nice if Python did not force its perfect world model, which unfortunately is not based on surrounding reality, and let users solve their encoding problems themselves - when they need to, because again, one can go quite a long way without dealing with encodings at all. Whereas now Python3 forces users to deal with encodings almost universally, by forcing a particular one on all strings (which, again, doesn't correspond to the state of surrounding reality). I can already hear the response that it's good that users are taught to deal with encodings, since that will make them write correct programs, but that's a bit far from the original aim of making it easy and pleasant to write "correct" programs. (And definitions of "correct" vary.)
As I've said before in other contexts, find me Windows, Mac OS X and JVM developers, or educators and scientists that are as concerned by the text model changes as folks that are primarily focused on Linux system (including network) programming, and I'll be more willing to concede the point.
Well, but this question reduces to finding out (or specifying) who the target audiences of Python are. It has always been (with a bow to Guido) a stronghold of scientific users (and would probably remain prominent in that role even if there were a mass exodus of other categories of users). But Python has always had its share as a system scripting language among Perl-haters, and with Perl flatlining, I guess it's fair to say that Python is a major system scripting and service implementation language.
Whom do features like memoryview, array.array, in-place input operations, etc. cater to? Scientists? I'm sure most of them are happy just sticking "@jit" on their kernel functions. And the scientists who do bother with memoryviews for their data structures are system-level-ish programmers too.
So it's no wonder that the Linux crowd cries at Python3 - it makes doing simple things unnecessarily complicated.
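For readers unfamiliar with the features named above, here is a minimal sketch of what "in-place input" with memoryview looks like. The io.BytesIO source is a hypothetical stand-in for a file or socket, and the buffer sizes are arbitrary; this is only an illustration, not code from the thread.

```python
import io
from array import array

# Hypothetical data source standing in for a file or socket.
source = io.BytesIO(bytes(range(16)))

buf = bytearray(8)        # allocated once, reused for every read
view = memoryview(buf)    # zero-copy window onto buf

# readinto() fills the given slice of buf directly, without
# creating an intermediate bytes object for each read.
n = source.readinto(view[:4])   # writes into buf[0:4]
m = source.readinto(view[4:])   # writes into buf[4:8]

# The same bytes can then be handed to array.array as typed data.
samples = array("B", buf)
```

Nothing here touches text or encodings at all, which is the kind of byte-level work the paragraph above has in mind.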
Windows, Mac OS X, and the JVM are all opinionated about the text encodings to be used at platform boundaries (using UTF-16, UTF-8 and UTF-16, respectively). By contrast, Linux (or, more accurately, POSIX) says "well, it's configurable, but we won't provide a reliable mechanism for finding out what the encoding is. So either guess as
Yes, I understand the complexity of developing a cross-platform language with advanced features. But I may offer another look at all this activity: Python3 was brave enough to do a revolution in its own world (catching a lot of its users by surprise), but surely not brave enough to do a revolution around itself, by saying something like "We choose ONE, the most right, and even the most used (per bytes transferred) encoding as our standard I/O encoding. Grow up, or explicitly specify the encoding you personally need."
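To make "explicitly specify the encoding" concrete, here is a small sketch, assuming a throwaway file in a temporary directory (the path and sample text are made up for illustration). It contrasts the locale-derived guess with an encoding named at the call site.

```python
import locale
import os
import tempfile

# What the platform suggests for text I/O; on POSIX this is
# derived from the locale settings and may simply be wrong.
guessed = locale.getpreferredencoding(False)

text = "na\u00efve caf\u00e9"
path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file

# Naming the encoding explicitly makes the result independent
# of whatever the locale claims.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, "rb") as f:
    raw = f.read()            # the bytes actually on disk

with open(path, encoding="utf-8") as f:
    round_trip = f.read()
```

With encoding="utf-8" given on both sides, the bytes on disk are the same on every platform, whatever value ends up in guessed.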
Surely, it didn't do that - it makes no sense to fight the world. But then Python3 is sympathetic to Java's desire to use "UTF-16" instead of the "right" encoding, and not so to Unix's desire to treat encodings as a separate level from content (and to treat Unicode as nothing more than yet another arbitrary encoding, which it formally is, and de facto will be for a long time, however sad that is). So maybe "cross-platform" should have meant "don't do implicit conversions". Because, see, Python2 had a problem with implicit encoding conversion when str and unicode objects were mixed, and Python3 has a problem with implicit conversions whenever str is used at all.
Anyway, I appreciate the detailed responses, understand what you (Python3 developers) are trying to achieve, appreciate your work, and hope it all works out. Each user has their own concerns about Unicode. Mine are efficiency and layering. But once MicroPython has UTF-8 support I will be much more relaxed about the former. Layering is harder to accept, but hopefully it can be tackled too, on both the mental and the technical side. I hope other users will find their peace with Unicode too!
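A rough sketch of the two behaviours being compared, for Python 3 only. The file name is a made-up example of bytes that are not valid UTF-8, and the surrogateescape handler is used here as a stand-in for what CPython does at POSIX boundaries such as file names (PEP 383).

```python
# Hypothetical file name in Latin-1: not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Python 3 refuses to mix bytes and str, where Python 2 would
# have attempted an implicit ASCII conversion.
try:
    b"abc" + "def"
    mixed_ok = True
except TypeError:
    mixed_ok = False

# At the OS boundary the conversion still happens implicitly.
# Undecodable bytes are smuggled through str as lone surrogates
# and restored when encoding back with the same handler.
as_text = raw_name.decode("utf-8", "surrogateescape")
back = as_text.encode("utf-8", "surrogateescape")
```

So the mixing error from Python 2 is gone, but a conversion still sits between the program and the raw bytes whenever str is involved, which is the layering concern raised above.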