[Python-ideas] unicodedata.itergraphemes (or str.itergraphemes / str.graphemes)

Philipp A. flying-sheep at web.de
Wed Jul 10 14:04:13 CEST 2013


2013/7/10 David Kendal <me at dpk.io>

> Well, right. I meant “a new type” like dict.keys() and dict.values() are
> “view types” on a dictionary that provide iterator interfaces. This would
> just be a “grapheme view” on a string.

I think that’s the way to go. Who would want dozens of new functions in
unicodedata?

How about something like the following? It can easily be extended to get a
reverse iterator.

Setting its pos and then calling find_grapheme, __next__, or previous
covers Bruce’s use cases.

class GraphemeIterator:
    def __init__(self, string, start=0):
        self.string = string
        self.pos = start

    def __iter__(self):
        return self

    def __next__(self):
        _, next_pos, grapheme = self.find_grapheme()
        self.pos = next_pos
        return grapheme

    def previous(self):
        prev_pos, _, grapheme = self.find_grapheme(backwards=True)
        self.pos = prev_pos
        return grapheme

    def find_grapheme(self, i=None, *, backwards=False):
        """Find a complete grapheme in the string, relative to position i.

        If backwards is not set, find the grapheme starting at i, or the
        next one if i is in the middle of a grapheme.
        If backwards is set, find the grapheme that i points into, even
        if i is in the middle of it; if self.string[i] is the beginning
        of a grapheme, find the one before it.
        If i is None, self.pos is used.
        """
        if i is None:
            i = self.pos
        ...
        return (start, end, grapheme)


def find_grapheme(string, i, backwards=False):
    """Convenience function for a one-shot lookup."""
    return GraphemeIterator(string, i).find_grapheme(backwards=backwards)
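
A minimal sketch of how the elided search could behave, to make the forward and backwards semantics from the docstring concrete. It is not the implementation proposed above. It assumes a much-simplified cluster rule (a character with a nonzero canonical combining class attaches to the character before it), which is not the full grapheme cluster algorithm of UAX #29. The names find_grapheme_simple and iter_graphemes_simple are hypothetical, and raising ValueError when no grapheme is found is also an assumption.

```python
import unicodedata


def _is_boundary(string, i):
    """True if a cluster may start at index i (simplified rule, not UAX #29)."""
    return i <= 0 or i >= len(string) or unicodedata.combining(string[i]) == 0


def find_grapheme_simple(string, i, backwards=False):
    """Return (start, end, grapheme); raise ValueError if there is none."""
    n = len(string)
    if backwards:
        if i <= 0:
            raise ValueError("no grapheme before position 0")
        # Step back at least one character, then back to the cluster start.
        # From the middle of a cluster this lands on that cluster; from the
        # start of a cluster it lands on the previous one.
        start = min(i, n) - 1
        while not _is_boundary(string, start):
            start -= 1
    else:
        # If i is in the middle of a cluster, skip forward to the next start.
        start = max(i, 0)
        while not _is_boundary(string, start):
            start += 1
        if start >= n:
            raise ValueError("no grapheme at or after the given position")
    # Extend the end over any combining characters that follow.
    end = start + 1
    while not _is_boundary(string, end):
        end += 1
    return start, end, string[start:end]


def iter_graphemes_simple(string):
    """Yield the clusters of string from left to right."""
    pos = 0
    while pos < len(string):
        _, pos, grapheme = find_grapheme_simple(string, pos)
        yield grapheme


text = "e\u0301x"  # 'e' + COMBINING ACUTE ACCENT, then 'x'
print([len(g) for g in iter_graphemes_simple(text)])  # [2, 1]
```

With this rule, a forward search from index 1 (the combining accent) skips ahead to "x", while a backwards search from index 1 or from index 2 both return the accented "e", matching the behaviour described in the docstring.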