
Andy Robinson wrote:
- you can work with old fashioned strings, which are understood by everyone to be arrays of bytes, and there is no magic conversion going on. The bytes in literal strings in your script file are the bytes that end up in the program.
Who is "everyone"? Are you saying that CP4E hordes are going to understand that the syntax "abcde" is constructing a *byte array*? It seems like you think that Python users are going to be more sophisticated in their understanding of these issues than Java programmers. In most other things, Python is simpler.
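To make the distinction concrete, here is a minimal sketch of the difference between a string of characters and an array of bytes. It uses present-day Python 3 notation, which is an assumption since it postdates this thread. The variable names are illustrative only.

```python
# Text is a sequence of characters; bytes are a sequence of 8-bit values.
# The same text maps to different byte arrays under different encodings.
text = "abcd\u00e9"                  # five characters; the last is e-acute

as_utf8 = text.encode("utf-8")       # e-acute becomes two bytes
as_latin1 = text.encode("latin-1")   # e-acute becomes one byte

print(len(text), len(as_utf8), len(as_latin1))
print(as_utf8 == as_latin1)          # the byte arrays differ

# Decoding each byte array with the matching codec recovers the same text.
same_text = as_utf8.decode("utf-8") == as_latin1.decode("latin-1")
print(same_text)
```

The point of the sketch is only that "abcde" read as text and "abcde" read as bytes coincide for ASCII and diverge as soon as a non-ASCII character appears.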
...
I'm also convinced that the majority of Python scripts won't need to work in Unicode.
Anything working with XML will need to be Unicode. Anything working with the Win32 API (especially COM) will want to do Unicode. Over time the entire Web infrastructure will move to Unicode. Anything written in JPython pretty much MUST use Unicode (doesn't it?).
Even working with exotic languages, there is always a native 8-bit encoding.
Unicode has many encodings: Shift-JIS, Big-5, EBCDIC ... You can use 8-bit encodings of Unicode if you want.

-- 
Paul Prescod - ISOGEN Consulting Engineer speaking for himself
It's difficult to extract sense from strings, but they're the only
communication coin we can count on.
- http://www.cs.yale.edu/~perlis-alan/quotes.html