Find and Replace Simplification

Joshua Landau joshua at landau.ws
Sat Jul 20 13:16:35 CEST 2013


On 19 July 2013 18:29, Serhiy Storchaka <storchaka at gmail.com> wrote:
> 19.07.13 19:22, Steven D'Aprano wrote:
>
>> I also expect that the string replace() method will be second fastest,
>> and re.sub will be the slowest, by a very long way.
>
>
> The string replace() method is fastest (at least in Python 3.3+). See
> implementation of html.escape() etc.

def escape(s, quote=True):
    if quote:
        return s.translate(_escape_map_full)
    return s.translate(_escape_map)

I fail to see how this supports the assertion that str.replace() is
faster. However, some quick timing shows that translate has a very
high penalty for missing characters and is a tad slower anyway.
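
For anyone who wants to repeat that kind of quick timing, here is a
minimal sketch. It is not the measurement referred to above: the
three-entry table, the sample strings and the helper names
(escape_translate, escape_replace, compare) are all assumptions made
for illustration, and it reads "missing characters" as characters that
are absent from the translation table. Results will vary with the
Python version and the input.

```python
import timeit

# Hypothetical escape table, similar in spirit to the one html.escape uses.
REPLACEMENTS = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
TABLE = str.maketrans(REPLACEMENTS)


def escape_translate(s):
    # One pass over the string, looking each character up in TABLE.
    return s.translate(TABLE)


def escape_replace(s):
    # "&" has to go first so the entities added later are not re-escaped.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    return s


def compare(sample, number=200):
    # Only compare timings for inputs where both approaches agree.
    assert escape_translate(sample) == escape_replace(sample)
    t = timeit.timeit(lambda: escape_translate(sample), number=number)
    r = timeit.timeit(lambda: escape_replace(sample), number=number)
    return t, r


if __name__ == "__main__":
    samples = [
        ("nothing to replace", "x" * 10000),
        ("lots to replace", "<a&b>" * 2000),
    ]
    for label, sample in samples:
        t, r = compare(sample)
        print(f"{label}: translate {t:.4f}s, replace {r:.4f}s")
```

The first sample contains only characters that are not in the table,
which is the case where the penalty would show up if that reading of
"missing characters" is right.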

Really, though, there should be no reason for .translate() to be
slower than replace -- at worst it should just be "reduce(lambda s,
ab: s.replace(*ab), mapping.items()¹, original_str)" and end up the
*same* speed as iterated replace. But the fact that it doesn't have to
rebuild the string on every replace means that theoretically it should
be a lot faster.

¹ I realise this won't actually work for several reasons, and doesn't
support things like passing in lists as mappings, but you could
trivially support the important builtin types² and fall back to the
original for others, where the pure-python __getitem__ is going to be
the slowest part anyway.

² List, tuple, dict, str, bytes -- so basically just mappings and
ordered iterables
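
To make the reduce() idea and one of its caveats concrete, here is a
rough sketch. naive_translate is a hypothetical helper name, and it
assumes a plain {character: replacement} dict rather than the
ordinal-keyed table that str.translate() really takes. The second
example shows one way the shortcut goes wrong: a later replace sees the
output of an earlier one, whereas translate maps each character exactly
once.

```python
from functools import reduce


def naive_translate(original_str, mapping):
    """Apply str.replace once per (char, replacement) pair, in dict order."""
    return reduce(lambda s, ab: s.replace(*ab), mapping.items(), original_str)


# When no replacement produces a character that is itself a key,
# the two approaches agree.
simple = {"<": "&lt;", ">": "&gt;"}
assert naive_translate("<b>", simple) == "<b>".translate(str.maketrans(simple))

# When replacements feed into each other, they do not.
swap = {"a": "b", "b": "a"}
single_pass = "ab".translate(str.maketrans(swap))  # 'ba'
iterated = naive_translate("ab", swap)             # 'aa'
print(single_pass, iterated)
```

Any fast path built on iterated replace would have to detect cases like
the swap above and fall back to the per-character lookup.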


