I'm agree with Daniel. Directly indexing into text suggests an attempted optimization that is likely to be incorrect for a set of strings. Splitting, regex, concatenation and formatting are really the main operations that matter, and MicroPython can optimize their implementation of these easily enough for O(N) indexing.


Top-posted from my Windows Phone

From: Daniel Holth
Sent: 6/4/2014 5:17
To: Paul Sokolovsky
Cc: python-dev
Subject: Re: [Python-Dev] Internal representation of strings and Micropython

If we're voting I think representing Unicode internally in micropython
as utf-8 with O(N) indexing is a great idea, partly because I'm not
sure indexing into strings is a good idea - lots of Unicode code
points don't make sense by themselves; see also grapheme clusters. It
would probably work great.

On Wed, Jun 4, 2014 at 7:49 AM, Paul Sokolovsky <pmiscml@gmail.com> wrote:
> Hello,
> On Wed, 4 Jun 2014 20:53:46 +1000
> Chris Angelico <rosuav@gmail.com> wrote:
>> On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky <pmiscml@gmail.com>
>> wrote:
>> > And I'm saying that not to discourage Unicode addition to
>> > MicroPython, but to hint that "force-force" approach implemented by
>> > CPython3 and causing rage and split in the community is not
>> > appreciated.
>> FWIW, it's Python 3 (the language) and not CPython 3.x (the
>> implementation) that specifies Unicode strings in this way.
> Yeah, but it's CPython what dictates how language evolves (some people
> even think that it dictates how language should be implemented!), so all
> good parts belong to Python3, and all bad parts - to CPython3,
> right? ;-)
>> I don't
>> know why it has to cause a split in the community; this is the one way
>> to make sure *everyone's* strings work perfectly, rather than having
>> ASCII strings work fine and others start tripping over problems in
>> various APIs.
> It did cause split in the community, that's the fact, that's why
> Python2 and Python3 are at the respective positions. Anyway, I'm not
> interested in participating in that split, I did not yet uttered my
> opinion on that publicly enough, so I seized a chance to drop some
> witty remarks, but I don't want to start yet another Unicode flame.
> So, let's please be back to Unicode storage representation in
> MicroPython. So, https://github.com/micropython/micropython/issues/657
> discussed technical aspects, in a recent mail on this list I expressed
> my opinion why following CPython way is not productive (for development
> satisfaction and evolution of Python community, to be explicit).
> Final argument I would have is that you certainly can implement Unicode
> support the PEP393 way - it would be enormous help and would be gladly
> accepted. The question, how useful it will be for MicroPython. It
> certainly will be useful to report passing of testsuites. But will it
> be *really* used?
> For microcontroller board, it might be too heavy (put simple, with it,
> people will be able to do less (== heap running out sooner)), than
> without it, so one may expect it to be disabled by default. Then POSIX
> port is there surely not to let people replace "python" command
> with "micropython" and run Django, but to let people develop and debug
> their apps with more comfort than on embedded board. So, it should
> behave close to MCU version, and would follow with MCU choice
> re: Unicode.
> That's actually the reason why I keep up this discussion - not for the
> sake of argument or to bash Python3's Unicode choices. With recent
> MicroPython announcement, we surely looked for more people to
> contribute to its development. But then we (or at least I can speak for
> myself), would like to make sure that these contribution are actually
> the most useful ones (for both MicroPython, and Python community in
> general, which gets more choices, rather than just getting N% smaller
> CPython rewrite).
> So, you're not sure how O(N) string indexing will work? But MicroPython
> offers a great opportunity to try! And it's something new and exciting,
> which surely will be useful (== will save people memory), not just
> something old and boring ;-).
>> ChrisA
> --
> Best regards,
>  Paul                          mailto:pmiscml@gmail.com
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
Python-Dev mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com