What is built-in method sub
steve at REMOVE-THIS-cybersource.com.au
Tue Jan 12 00:25:44 CET 2010
On Mon, 11 Jan 2010 13:51:48 -0800, Chris Rebert wrote:
> On Mon, Jan 11, 2010 at 12:34 PM, Steven D'Aprano
> <steve at remove-this-cybersource.com.au> wrote: <snip>
>> If you can avoid regexes in favour of ordinary string methods, do so.
>> In general, something like:
>> source.replace(target, new)
>> will potentially be much faster than:
>> regex = re.compile(target)
>> regex.sub(new, source)
>> # equivalent to re.sub(target, new, source)
>> (assuming of course that target is just a plain string with no regex
>> specialness). If you're just cracking a peanut, you probably don't need
>> the 30 lb sledgehammer of regular expressions.
> Of course, but is the regex library really not smart enough to
> special-case and optimize vanilla string substitutions?
Apparently not in Python 2.5:
>>> from timeit import Timer
>>> t1 = Timer('x.sub("Dutch", "Nobody expects the Spanish Inquisition!")',
... 'from re import compile; x = compile("Spanish")')
>>> t2 = Timer('x.replace("Spanish", "Dutch")',
... 'x="Nobody expects the Spanish Inquisition!"')
>>> t1.repeat()
[3.7209370136260986, 2.7262279987335205, 2.6416280269622803]
>>> t2.repeat()
[2.2915709018707275, 1.2584249973297119, 1.2730350494384766]
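If you want to repeat the measurement on your own interpreter, something like the following should do as a rough sketch. The function name time_both and the loop counts are my own choices for illustration, and the numbers you get will depend on your machine and Python version, so treat them as indicative only.

```python
import re
from timeit import repeat

SOURCE = "Nobody expects the Spanish Inquisition!"


def time_both(number=100000, rounds=3):
    """Time regex substitution against str.replace (hypothetical helper).

    Returns two lists of timings in seconds, one entry per round.
    """
    pattern = re.compile("Spanish")  # compiled once, outside the timed code
    regex_times = repeat(
        lambda: pattern.sub("Dutch", SOURCE), number=number, repeat=rounds
    )
    replace_times = repeat(
        lambda: SOURCE.replace("Spanish", "Dutch"), number=number, repeat=rounds
    )
    return regex_times, replace_times


if __name__ == "__main__":
    regex_times, replace_times = time_both()
    # The minimum is usually the least noisy figure to compare.
    print("regex sub   best:", min(regex_times))
    print("str.replace best:", min(replace_times))
```

The lambda adds a little call overhead to both timings equally, so the comparison between the two is still fair even if the absolute figures are not.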
Even if it did, I wouldn't rely on that sort of special casing unless the
language guaranteed it. Keep in mind that regexes are essentially a
programming language (although not Turing Complete), and the engine
implementation may choose purity and simplicity over such optimizations.
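The "no regex specialness" caveat matters for correctness as well as speed. Here is a small sketch of what goes wrong when the target contains a metacharacter, and how re.escape avoids it if you are stuck with a regex for some reason. The helper name replace_literal and the sample string are made up for illustration.

```python
import re


def replace_literal(source, target, new):
    """Regex-based replacement that treats target as plain text (hypothetical helper)."""
    # re.escape neutralises metacharacters in the pattern; passing a function
    # as the replacement stops backslashes in `new` being read as group references.
    return re.sub(re.escape(target), lambda match: new, source)


source = "Price: 3.50 (was 3x50)"

# Plain string method: only the literal text "3.50" is replaced.
print(source.replace("3.50", "4.00"))            # Price: 4.00 (was 3x50)

# Unescaped regex: "." matches any character, so "3x50" is replaced as well.
print(re.sub("3.50", "4.00", source))            # Price: 4.00 (was 4.00)

# Escaped regex: behaves like str.replace again.
print(replace_literal(source, "3.50", "4.00"))   # Price: 4.00 (was 3x50)
```

If the target really is just a plain string, source.replace() gives you the right answer without having to think about any of this.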