[Python-ideas] unicodedata.itergraphemes (or str.itergraphemes / str.graphemes)
Terry Reedy
tjreedy at udel.edu
Tue Jul 9 23:17:31 CEST 2013
On 7/9/2013 12:51 PM, Bruce Leban wrote:
> If you want to do any operation on the clusters other than in iteration
> order, without indexed access you're going to end up doing
> list(grapheme_clusters(...)) first to give you indexed access. Maybe
> that's the right thing to do sometimes but I wouldn't force it on
> people. The string already provides indexed access but I need to know
> cluster boundaries.
I think the best alternative to a list subclass of grapheme substrings
(a subclass so that methods can be added) might be a GraphemeSeq wrapper
class that contains a string (perhaps in a known normal form) and a list
of indexes of grapheme start positions. That would also allow
grapheme-oriented methods. If not already done, either or both of these
would make good PyPI modules.
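
A rough sketch of what such a wrapper might look like, not a finished
design. The class name comes from the paragraph above; everything else
(the attribute names, the cluster_at method, the default normal form) is
made up for illustration. The boundary rule is a deliberate
simplification: a cluster is taken to start at any code point whose
canonical combining class is 0, which is NOT the full Unicode (UAX #29)
grapheme cluster algorithm and gets Hangul jamo, ZWJ emoji sequences,
and spacing marks wrong.

```python
import unicodedata
from bisect import bisect_right


def _cluster_starts(s):
    # ASSUMPTION: simplified boundary rule, not UAX #29. A new cluster
    # starts at offset 0 and at every code point that is not a
    # combining mark (canonical combining class 0).
    return [i for i, ch in enumerate(s)
            if i == 0 or unicodedata.combining(ch) == 0]


class GraphemeSeq:
    """Hypothetical wrapper: a string plus a list of cluster start offsets."""

    def __init__(self, text, form="NFC"):
        # Store the string in a known normal form, as suggested above.
        self.text = unicodedata.normalize(form, text)
        self.starts = _cluster_starts(self.text)

    def __len__(self):
        # Number of clusters, not number of code points.
        return len(self.starts)

    def __getitem__(self, k):
        # Indexed access to the k-th cluster (integer indexes only).
        n = len(self.starts)
        if k < 0:
            k += n
        if not 0 <= k < n:
            raise IndexError("grapheme index out of range")
        begin = self.starts[k]
        end = self.starts[k + 1] if k + 1 < n else len(self.text)
        return self.text[begin:end]

    def __iter__(self):
        return (self[k] for k in range(len(self)))

    def cluster_at(self, pos):
        # Map a code point offset in self.text to its cluster index.
        if not 0 <= pos < len(self.text):
            raise IndexError("string index out of range")
        return bisect_right(self.starts, pos) - 1
```

Keeping the original string and the list of start offsets side by side
means the string's own indexed access stays available, while len(),
indexing, and iteration on the wrapper work in units of clusters. A
real module would replace _cluster_starts with a proper UAX #29
implementation.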
--
Terry Jan Reedy