[Python-ideas] unicodedata.itergraphemes (or str.itergraphemes / str.graphemes)
Masklinn
masklinn at masklinn.net
Tue Jul 9 10:31:27 CEST 2013
On 2013-07-09, at 07:30 , Stephen J. Turnbull wrote:
> Bruce Leban writes:
>
>> On Sun, Jul 7, 2013 at 3:29 AM, David Kendal <me at dpk.io> wrote:
>>> But there's no way to iterate over Unicode graphemes
>
>> A common case is wanting to extract the current grapheme or move
>> forward or backward one. Please consider these other use cases
>> rather than just adding an iterator.
>
>> g = unicodedata.grapheme_cluster(str, i)
>> # extracts cluster that includes index i (i may be in the middle
>> # of the cluster)
>
> Why is indexing a string and returning a grapheme a common case?
I don't know how common it is, but I do know NSString provides two
messages for it. rangeOfComposedCharacterSequenceAtIndex: takes an
index into a string and returns the boundaries of the grapheme that
contains it, and rangeOfComposedCharacterSequencesForRange: takes a
range and returns the range covering all the graphemes it touches.
Of course, that might just be because it does not provide a higher-level
iterator over graphemes.
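
For illustration, here is a rough Python sketch of what those two lookups
could look like. This is not a proposed unicodedata API and it is not UAX #29
segmentation: it assumes a "cluster" is just one character followed by any
characters whose general category starts with "M" (marks), which misses
Hangul jamo, ZWJ sequences, regional indicators and so on. The names
cluster_range_at, cluster_range_for_range and iter_clusters are made up for
this example.

```python
import unicodedata


def _is_mark(ch):
    # Assumption: general categories Mn/Mc/Me stand in for "extends the
    # previous character". Real grapheme rules (UAX #29) are more involved.
    return unicodedata.category(ch).startswith("M")


def cluster_range_at(s, i):
    """Return (start, end) of the simplified cluster containing index i.

    i may point into the middle of a cluster; the range is half-open.
    """
    if not 0 <= i < len(s):
        raise IndexError(i)
    start = i
    # Walk back over marks until we reach the base character.
    while start > 0 and _is_mark(s[start]):
        start -= 1
    end = i + 1
    # Walk forward over any marks that follow.
    while end < len(s) and _is_mark(s[end]):
        end += 1
    return start, end


def cluster_range_for_range(s, lo, hi):
    """Widen the half-open range [lo, hi) so it covers whole clusters."""
    if lo >= hi:
        return lo, hi
    start, _ = cluster_range_at(s, lo)
    _, end = cluster_range_at(s, hi - 1)
    return start, end


def iter_clusters(s):
    """Yield simplified clusters in order, built on the index lookup."""
    i = 0
    while i < len(s):
        start, end = cluster_range_at(s, i)
        yield s[start:end]
        i = end
```

With s = "e\u0301a" (an "e", a combining acute accent, then "a"), asking
for the cluster at index 1 gives the range of the accented "e" even though
index 1 is the combining mark. The iterator falls out of the index-based
lookup, which is roughly the relationship between the two approaches
discussed in this thread.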