Re: [Python-Dev] Internal representation of strings and Micropython

4 Jun 2014


      Hello,

On Wed, 4 Jun 2014 12:32:12 +1000
Chris Angelico wrote:
...
> On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano wrote:
...
>> * Having a build-time option to restrict all strings to ASCII-only.
>> (I think what they mean by that is that strings will be like
>> Python 2 strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
> What I was actually suggesting along those lines was that the str type
> still be notionally a Unicode string, but that any codepoints >127
> would either raise an exception or blow an assertion,
That's another reason why people don't like Unicode enforced upon them
- all the talk about supporting all languages and scripts is demagogy
and hypocrisy. Given a choice, Unicode zealots would rather limit
people to Latin script than give up on their arbitrarily chosen,
one-among-thousands,
soon-to-be-replaced-by-Apple's-and-Microsoft's-"exciting-new" encoding.

Once again, my claim is that what MicroPython implements now is the
more correct handling, in a sense wider than technical. We don't provide
Unicode encoding support, because it's highly bloated, but let people
use any encoding they like. That comes at some price: for example, the
length of a string in characters is not known to the runtime, only its
length in bytes. Still, quite a lot of applications can be written with
just that.
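
To illustrate the price mentioned above, here is a minimal sketch in
CPython 3 (not MicroPython's actual code). It assumes the bytes happen
to be valid UTF-8; count_code_points is a hypothetical helper. A runtime
that stores only raw bytes knows the first number but would have to
compute the second, and only if it knew the encoding:

```python
def count_code_points(raw: bytes) -> int:
    """Count code points in UTF-8 data (hypothetical helper).

    Assumes `raw` is valid UTF-8. Continuation bytes have the bit
    pattern 10xxxxxx, so every other byte starts a new code point.
    """
    return sum(1 for byte in raw if (byte & 0xC0) != 0x80)


text = "Привет"                 # six Cyrillic letters
raw = text.encode("utf-8")      # what a bytes-only runtime would store

byte_length = len(raw)                 # 12: two bytes per letter
char_length = count_code_points(raw)   # 6: needs knowledge of the encoding
```

With a different encoding (say, a single-byte Cyrillic code page) the
same text would have equal byte and character lengths, which is why a
bytes-only runtime cannot answer the question in general.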

And I'm saying that not to discourage adding Unicode to MicroPython,
but to hint that the "force-force" approach implemented by CPython 3,
which is causing rage and a split in the community, is not appreciated.
...
> and all the code
> to handle multibyte representations would be compiled out. So there'd
> still be a difference between strings of text and streams of bytes,
> but all encoding and decoding to/from ASCII-compatible encodings would
> just point to the same bytes in RAM.
> Risk: Someone would implement that with assertions, then compile with
> assertions disabled, test only with ASCII, and have lurking bugs.
> ChrisA
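
The risk described in the quoted text can be sketched in CPython terms
as follows. Both function names are hypothetical, and this only
illustrates the assert-versus-explicit-check difference, not any actual
MicroPython build option. In CPython, running with -O strips assert
statements, so only the explicit check keeps rejecting non-ASCII data:

```python
def ensure_ascii(data: bytes) -> bytes:
    """Reject any byte above 127 with an explicit check (hypothetical helper)."""
    for index, byte in enumerate(data):
        if byte > 127:
            # An explicit raise is still executed under `python -O`.
            raise ValueError(f"non-ASCII byte 0x{byte:02x} at offset {index}")
    return data


def ensure_ascii_with_assert(data: bytes) -> bytes:
    """Same check written as an assertion (hypothetical helper).

    Under `python -O` the assert is removed, so non-ASCII bytes pass
    through silently: the lurking bug described in the quoted text.
    """
    assert all(byte <= 127 for byte in data), "non-ASCII byte in string"
    return data
```

Code tested only with ASCII input behaves identically with either
helper, which is exactly why the difference would go unnoticed.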
-- 
Best regards,
 Paul                          mailto:pmiscml@gmail.com

Paul Sokolovsky