I don't think this should be a str method.
It should be a separate class, so that the internal tree can be built once and reused.
There are some Aho-Corasick implementations on PyPI. As far as I know, AC is longest match.
On the other hand, Go's replacer (also trie-based) is documented as:
"Replacements are performed in order, without overlapping matches."
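To make the "separate class" idea concrete, here is a minimal sketch. The name `TrieReplacer` and its methods are hypothetical, not an existing API. It uses a plain trie scan with a longest-match policy (no Aho-Corasick failure links), so it only illustrates building the tree once and reusing it; it is not an optimized implementation.

```python
class TrieReplacer:
    """Hypothetical class: builds a trie once, then reuses it for every call."""

    _END = object()  # sentinel key marking "a search string ends at this node"

    def __init__(self, mapping):
        self._root = {}
        for old, new in mapping.items():
            if not old:
                raise ValueError("empty search strings are not supported")
            node = self._root
            for ch in old:
                node = node.setdefault(ch, {})
            node[self._END] = new

    def _longest_match(self, text, start):
        """Return (end, replacement) for the longest match at `start`, or None."""
        node = self._root
        best = None
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if self._END in node:
                best = (i + 1, node[self._END])
        return best

    def replace(self, text):
        out = []
        i = 0
        while i < len(text):
            match = self._longest_match(text, i)
            if match is None:
                out.append(text[i])
                i += 1
            else:
                end, new = match
                out.append(new)
                i = end  # resume after the match, so matches never overlap
        return "".join(out)


replacer = TrieReplacer({"cat": "dog", "dog": "cat"})
print(replacer.replace("cat chases dog"))  # dog chases cat
```

Replacement text is never rescanned, which is what makes the replacements simultaneous. A different tie-breaking rule (for example, priority by the order the strings were given rather than by length) would only change `_longest_match`.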
On Sun, Feb 4, 2018 at 7:04 AM, Franklin? Lee email@example.com wrote:
Let s be a str. I propose to allow these existing str methods to take params in new forms.
s.replace(old, new): Allow passing in a collection of olds. Allow passing in a single argument, a mapping of olds to news. Allow the olds in the mapping to be tuples of strings.
s.split(sep), s.rsplit, s.partition: Allow sep to be a collection of separators.
s.startswith, s.endswith: Allow argument to be a collection of strings.
s.find, s.index, s.count, x in s: Similar. These methods are also in `list`, which can't distinguish between items, subsequences, and subsets. However, `str` is already inconsistent with `list` here: list.M looks for an item, while str.M looks for a subsequence.
s.[r|l]strip: Sadly, these functions already interpret their str arguments as collections of characters.
These new forms can be optimized internally, as a search for multiple candidate substrings can be more efficient than searching for one at a time. See https://stackoverflow.com/questions/3260962/algorithm-to-find-multiple-strin...
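For reference, a short sketch of how the methods listed above behave today, plus the kind of loop the proposal would replace. `find_any` is a hypothetical helper written for this example only.

```python
# startswith/endswith already accept a tuple of candidates...
assert "hello.py".endswith((".py", ".pyw"))

# ...but reject other collections such as a list.
try:
    "hello.py".endswith([".py", ".pyw"])
except TypeError:
    list_rejected = True
else:
    list_rejected = False

# strip treats its str argument as a set of characters, not as a substring.
assert "xyhelloyx".strip("xy") == "hello"


def find_any(s, candidates):
    """Hypothetical helper: lowest index at which any candidate occurs, or -1."""
    best = -1
    for candidate in candidates:
        i = s.find(candidate)
        if i != -1 and (best == -1 or i < best):
            best = i
    return best


print(find_any("hello world", ["world", "lo"]))  # 3
```

The loop scans the string once per candidate, which is the cost a multi-pattern search could avoid.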
The most significant change is to .replace. The others are simple enough to simulate with a loop or something. It is harder to make multiple simultaneous replacements using one .replace at a time, because an earlier replacement can produce text that a later replacement then matches. The easiest Python solution is to use regex or install some package, which (if you're lucky) uses regex or (if unlucky) doesn't simulate simultaneous replacements. (If possible, just use str.translate.)
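A small illustration of the problem, and of the regex workaround mentioned above. The helper names `sequential_replace` and `simultaneous_replace` are made up for this sketch.

```python
import re


def sequential_replace(text, mapping):
    """Apply one .replace at a time; later passes see earlier output."""
    for old, new in mapping.items():
        text = text.replace(old, new)
    return text


def simultaneous_replace(text, mapping):
    """Single regex pass; replaced text is never scanned again."""
    if not mapping:
        return text
    pattern = re.compile("|".join(re.escape(old) for old in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


swap = {"cat": "dog", "dog": "cat"}
print(sequential_replace("cat chases dog", swap))    # cat chases cat
print(simultaneous_replace("cat chases dog", swap))  # dog chases cat

# For single-character keys, str.translate already does this in one pass.
print("abc".translate(str.maketrans({"a": "b", "b": "a"})))  # bac
```

Note that regex alternation picks the first alternative that matches at a position, not the longest, so overlapping keys such as "a" and "ab" would need to be ordered deliberately.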
I suppose .split on multiple separators is also annoying to simulate. The two-argument form of .split may be even more of a burden, though I don't know when a limited multiple-separator split is useful. The current best solution is, like before, to use regex, or install a package and hope for the best.
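The regex version of a multi-separator split, including the limited form, might look like the sketch below. `split_any` is a hypothetical helper, and it assumes longer separators should win over their own prefixes.

```python
import re


def split_any(text, separators, maxsplit=-1):
    """Hypothetical helper: like str.split(sep, maxsplit), for several separators."""
    # Try longer separators first, so "->" is not split as "-" followed by ">".
    seps = sorted(set(separators), key=len, reverse=True)
    if not seps or "" in seps:
        raise ValueError("need at least one non-empty separator")
    if maxsplit == 0:
        return [text]  # str.split: 0 means "no splits"; re.split: 0 means "no limit"
    pattern = "|".join(re.escape(sep) for sep in seps)
    return re.split(pattern, text, maxsplit=max(maxsplit, 0))


print(split_any("a, b;c", [", ", ";"]))            # ['a', 'b', 'c']
print(split_any("a,b;c,d", [",", ";"], maxsplit=2))  # ['a', 'b', 'c,d']
```

The `maxsplit` translation is the easy part to get wrong, since the two APIs disagree on what 0 means.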
Python-ideas mailing list
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/