I agree with Daniel. Directly indexing into text suggests an attempted optimization that is likely to be incorrect for some set of strings. Splitting, regex, concatenation and formatting are really the main operations that matter, and MicroPython can optimize its implementation of these easily enough even with O(N) indexing.
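As a rough illustration of why splitting does not need code-point indexing, here is a minimal sketch in ordinary Python. It is not MicroPython's actual implementation, and `split_utf8` is a hypothetical helper name. It relies on UTF-8 being self-synchronizing: a validly encoded separator can only match at a code point boundary, so a plain byte search is enough.

```python
def split_utf8(data: bytes, sep: bytes) -> list:
    """Split UTF-8 encoded bytes on a UTF-8 encoded separator in one linear scan.

    Hypothetical helper for illustration only.
    """
    if not sep:
        raise ValueError("empty separator")
    parts = []
    start = 0
    while True:
        # Byte-level search; no code-point index is ever computed.
        pos = data.find(sep, start)
        if pos < 0:
            parts.append(data[start:])
            return parts
        parts.append(data[start:pos])
        start = pos + len(sep)


text = "naïve,日本語,🙂"
pieces = [p.decode("utf-8") for p in split_utf8(text.encode("utf-8"), b",")]
```

Here `pieces` comes out the same as `text.split(",")`, even though the input mixes 1-, 2-, 3- and 4-byte sequences.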
Top-posted from my Windows Phone
________________________________
From: Daniel Holth <firstname.lastname@example.org>
Sent: 6/4/2014 5:17
To: Paul Sokolovsky <email@example.com>
Cc: python-dev <firstname.lastname@example.org>
Subject: Re: [Python-Dev] Internal representation of strings and Micropython
If we're voting I think representing Unicode internally in micropython as utf-8 with O(N) indexing is a great idea, partly because I'm not sure indexing into strings is a good idea - lots of Unicode code points don't make sense by themselves; see also grapheme clusters. It would probably work great.
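To make both points concrete, here is a minimal sketch (plain Python, not MicroPython's code; `utf8_char_at` is a hypothetical name) of what O(N) indexing over a UTF-8 buffer could look like. It walks the bytes from the start and counts every byte that is not a continuation byte (`10xxxxxx`).

```python
def utf8_char_at(data: bytes, index: int) -> str:
    """Return the code point at `index` by scanning UTF-8 bytes from the start.

    Hypothetical helper; assumes `data` is valid UTF-8. Cost is O(N) in the
    byte length because there is no fixed character width to jump by.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    count = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: new code point
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("string index out of range")
    return data[start:].decode("utf-8")


word = "e\u0301tude"  # "étude" written as 'e' + COMBINING ACUTE ACCENT
encoded = word.encode("utf-8")
first = utf8_char_at(encoded, 0)
second = utf8_char_at(encoded, 1)
```

Index 0 yields a bare "e" and index 1 yields the combining accent alone, so even a correct code-point index cuts the user-visible character (the grapheme cluster) in half.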
On Wed, Jun 4, 2014 at 7:49 AM, Paul Sokolovsky email@example.com wrote:
On Wed, 4 Jun 2014 20:53:46 +1000 Chris Angelico firstname.lastname@example.org wrote:
On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky email@example.com wrote:
And I'm saying that not to discourage adding Unicode to MicroPython, but to hint that the "force-force" approach implemented by CPython3, which is causing rage and a split in the community, is not appreciated.
FWIW, it's Python 3 (the language) and not CPython 3.x (the implementation) that specifies Unicode strings in this way.
Yeah, but it's CPython that dictates how the language evolves (some people even think that it dictates how the language should be implemented!), so all the good parts belong to Python3, and all the bad parts to CPython3, right? ;-)
I don't know why it has to cause a split in the community; this is the one way to make sure *everyone's* strings work perfectly, rather than having ASCII strings work fine and others start tripping over problems in various APIs.
It did cause a split in the community, that's a fact, and that's why Python2 and Python3 are in their respective positions. Anyway, I'm not interested in participating in that split. I had not yet voiced my opinion on it publicly enough, so I seized the chance to drop some witty remarks, but I don't want to start yet another Unicode flame.
So, let's please get back to the Unicode storage representation in MicroPython. https://github.com/micropython/micropython/issues/657 discussed the technical aspects, and in a recent mail on this list I expressed my opinion on why following the CPython way is not productive (for development satisfaction and the evolution of the Python community, to be explicit).
The final argument I would make is that you certainly can implement Unicode support the PEP393 way - it would be an enormous help and would be gladly accepted. The question is how useful it will be for MicroPython. It will certainly be useful for reporting that testsuites pass. But will it *really* be used?
For a microcontroller board, it might be too heavy (put simply, with it people will be able to do less than without it, because the heap runs out sooner), so one may expect it to be disabled by default. And the POSIX port is surely there not to let people replace the "python" command with "micropython" and run Django, but to let people develop and debug their apps with more comfort than on an embedded board. So it should behave close to the MCU version, and would follow the MCU choice re: Unicode.
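For a rough feel of the memory trade-off, the sketch below compares character payload sizes only. It assumes PEP 393's scheme of 1, 2 or 4 bytes per character chosen by the widest code point in the string, and it ignores object headers and terminators, so these are not real heap figures for either CPython or MicroPython. Both function names are hypothetical.

```python
def pep393_payload_bytes(s: str) -> int:
    """Payload bytes under a PEP 393 style fixed-width layout (headers excluded)."""
    if not s:
        return 0
    widest = max(map(ord, s))
    if widest < 0x100:
        width = 1      # every character fits in Latin-1
    elif widest < 0x10000:
        width = 2      # every character fits in the BMP
    else:
        width = 4      # at least one astral character
    return width * len(s)


def utf8_payload_bytes(s: str) -> int:
    """Payload bytes when the same text is stored as UTF-8."""
    return len(s.encode("utf-8"))


samples = ["hello", "hello 🙂", "日本語"]
sizes = {s: (pep393_payload_bytes(s), utf8_payload_bytes(s)) for s in samples}
```

With these samples a single astral character widens the whole fixed-width string ("hello 🙂" is 28 bytes against 10 in UTF-8), while pure CJK text is slightly smaller in the fixed-width form (6 against 9), so the saving depends on the text being stored.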
That's actually the reason why I keep up this discussion - not for the sake of argument or to bash Python3's Unicode choices. With the recent MicroPython announcement, we certainly hoped for more people to contribute to its development. But then we (or at least I, speaking for myself) would like to make sure that these contributions are actually the most useful ones (for both MicroPython and the Python community in general, which gets more choices rather than just an N% smaller CPython rewrite).
So, you're not sure how O(N) string indexing will work? But MicroPython offers a great opportunity to try! And it's something new and exciting, which surely will be useful (== will save people memory), not just something old and boring ;-).
--
Best regards,
Paul mailto:firstname.lastname@example.org
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com