04.02.18 00:04, Franklin? Lee wrote: The name for complicated str methods is regular expressions. To do these operations efficiently, the arguments have to be converted into a special optimized form. This is what re.compile() does. If that compilation happened on every invocation of a str method, it would add too much overhead and kill performance.
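A minimal sketch of the "compile once, reuse many times" pattern this refers to. The pattern and the helper name split_fields are made up for illustration only.

```python
import re

# Compile once, outside the hot path; the compiled object is reused on every call.
SEPARATORS = re.compile(r"[,;]")  # hypothetical pattern: split on ',' or ';'

def split_fields(line):
    """Split a line on ',' or ';' using the precompiled pattern."""
    return SEPARATORS.split(line)

print(split_fields("a,b;c"))
```

A str method that accepted a collection of separators would have to build an equivalent structure from its arguments on each call, unless it cached the result somewhere.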
Let s be a str. I propose to allow these existing str methods to take params in new forms.
s.replace(old, new):
Allow passing in a collection of olds.
Allow passing in a single argument, a mapping of olds to news.
Allow the olds in the mapping to be tuples of strings.
s.split(sep), s.rsplit, s.partition:
Allow sep to be a collection of separators.
s.startswith, s.endswith:
Allow argument to be a collection of strings.
s.find, s.index, s.count, x in s:
Similar.
These methods are also in `list`, which can't distinguish between items, subsequences, and subsets. However, `str` is already inconsistent with `list` here: list.M looks for an item, while str.M looks for a subsequence.
s.[r|l]strip:
Sadly, these functions already interpret their str arguments as collections of characters.
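For comparison, a rough sketch of how the multi-pattern replace and split proposed above can be emulated with the re module today. The helper names replace_many and split_any are hypothetical, keys are assumed to be non-empty, and preferring the longest key at a given position is one possible choice, not something the proposal specifies.

```python
import re

def _alternation(strings):
    """Build a regex matching any of the given literal strings, longest first."""
    ordered = sorted(strings, key=len, reverse=True)
    return "|".join(re.escape(x) for x in ordered)

def replace_many(s, mapping):
    """Hypothetical helper: replace each key of mapping found in s, in one pass."""
    if not mapping:
        return s
    pattern = re.compile(_alternation(mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], s)

def split_any(s, seps):
    """Hypothetical helper: split s on any of the separators in seps."""
    return re.split(_alternation(seps), s)

# One pass, so swapped words do not get replaced twice.
print(replace_many("spam and eggs", {"spam": "eggs", "eggs": "spam"}))
print(split_any("a, b;c", [", ", ";"]))

# Related behaviour that str and list already have today:
assert "spam".startswith(("sp", "eg"))            # a tuple of prefixes is accepted
assert "abcba".strip("ab") == "c"                 # argument is a set of characters
assert "ab" in "abc" and [1, 2] not in [1, 2, 3]  # subsequence vs. item lookup
```

Both helpers rebuild the pattern on every call, which is exactly the overhead in question when no precompiled object is kept around.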
Even for a simple string search, a regular expression can be more efficient than a str method.
$ ./python -m timeit -s 'import re; p = re.compile("spam"); s = "spa"*100+"m"' -- 'p.search(s)'
500000 loops, best of 5: 680 nsec per loop
$ ./python -m timeit -s 's = "spa"*100+"m"' -- 's.find("spam")'
200000 loops, best of 5: 1.09 usec per loop
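The same comparison can be sketched from inside Python with the timeit module. The loop count n is an arbitrary choice, the lambda wrappers add a small constant overhead to both measurements, and the absolute numbers (and possibly which side wins) depend on the interpreter version and the machine.

```python
import re
import timeit

s = "spa" * 100 + "m"
p = re.compile("spam")

# Both approaches must agree on where the match starts.
assert p.search(s).start() == s.find("spam")

n = 100_000  # arbitrary number of repetitions
t_re = timeit.timeit(lambda: p.search(s), number=n)
t_find = timeit.timeit(lambda: s.find("spam"), number=n)

print(f"p.search(s):    {t_re / n * 1e9:.0f} nsec per call")
print(f"s.find('spam'): {t_find / n * 1e9:.0f} nsec per call")
```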