[Python-3000] How will unicode get used?
Adam Olsen
rhamph at gmail.com
Wed Sep 20 14:55:56 CEST 2006
Before we can decide on the internal representation of our unicode
objects, we need to decide on their external interface. My thoughts
so far:
* Most transformation and testing methods (.lower(), .islower(), etc.)
can be copied directly from 2.x. They need no special support from
the internal representation to perform reasonably.
* Indexing and slicing are the big issue. Do we need constant-time
integer slicing? .find() could be changed to return a token that
could be used as a constant-time offset. Incrementing the token would
cost time linear in the size of the increment, but that's no big deal
if the offsets are always small.
* Grapheme clusters, words, lines, and other groupings: do we need or
want ways to slice based on them too?
* Cheap slicing and concatenation (between O(1) and O(log(n))): do we
want to support them? Now would be the time.
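
To make the token idea from the indexing point concrete, here is a
rough sketch, not a proposal for the real API. It assumes a UTF-8
internal buffer, and the names Utf8Text, Token, advance() and
slice_from() are all made up for illustration. find() returns an
opaque token wrapping a byte offset, slicing from a token is a direct
offset lookup, and advancing a token costs time linear in the number
of code points skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Opaque position in a Utf8Text (hypothetical; wraps a byte offset)."""
    offset: int


class Utf8Text:
    """Hypothetical string type that stores its text as UTF-8 bytes."""

    def __init__(self, text):
        self._buf = text.encode("utf-8")

    def find(self, needle):
        """Return a Token for the first match, or None if there is none."""
        pos = self._buf.find(needle.encode("utf-8"))
        return None if pos < 0 else Token(pos)

    def advance(self, token, count=1):
        """Move a token forward by `count` code points; O(count) work."""
        pos = token.offset
        for _ in range(count):
            if pos >= len(self._buf):
                raise IndexError("token moved past end of text")
            pos += 1
            # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
            while pos < len(self._buf) and (self._buf[pos] & 0xC0) == 0x80:
                pos += 1
        return Token(pos)

    def slice_from(self, token):
        """Return the text from the token to the end, with no scan to find it."""
        return self._buf[token.offset:].decode("utf-8")


text = Utf8Text("naïve café au lait")
start = text.find("café")
tail = text.slice_from(text.advance(start, 4))
```

Callers never see the byte offset as an integer index, so the same
interface could sit on top of a UTF-16 or UTF-32 buffer instead.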
--
Adam Olsen, aka Rhamphoryncus