[Python-Dev] Internal representation of strings and Micropython

Paul Sokolovsky pmiscml at gmail.com
Thu Jun 5 03:01:38 CEST 2014


Hello,

On Thu, 05 Jun 2014 12:03:17 +1200
Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Serhiy Storchaka wrote:
> > html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize
> > don't use iterators. They use indices, str.find and/or regular
> > expressions. Common use case is quickly find substring starting
> > from current position using str.find or re.search, process found
> > token, advance position and repeat.
> 
> For that kind of thing, you don't need an actual character
> index, just some way of referring to a place in a string.
> 
> Instead of an integer, str.find() etc. could return a
> StringPosition, 

That's more brave then I had in mind, but definitely shows what
alternative implementation have in store to fight back if some
perfomance problems are actually detected. My own thoughts were, for
example, as response to people who (quoting) "slice strings for living"
is some form of "extended slicing" like str[(0, 4, 6, 8, 15)].

But I really think that providing iterator interface for common string
operations would cover most of real-world cases, and will be actually
beneficial for Python language in general.

> 
> -- 
> Greg


-- 
Best regards,
 Paul                          mailto:pmiscml at gmail.com


More information about the Python-Dev mailing list