[Tutor] regular expression
Rich Krauter
rmkrauter at yahoo.com
Thu Feb 12 22:27:15 EST 2004
On Thu, 2004-02-12 at 20:50, Conrad Koziol wrote:
> What is the fastest way to search for a string and then surround it
> with something, such as <code> and </code>? Like so:
>
> x = '<div> are not allowed, these arent either <br>'
> <some code here>
> x = '<code><div></code> are not allowed, these arent either
> <code><br></code>'
>
> The two ways this can be done are by substituting a string like <div>
> with <code><div></code> or by inserting <code> and </code> before and after
> it. Which one would be faster and how would I do it? I got as far as
> creating the regular expression r'<[^<>]*>'
>
> Thanks!!
Hi Conrad,
This seems to work. I don't know about speed, and it's *not* thoroughly
tested:
import re
x = '<div> are not allowed, these arent either <br>'
# Match a tag that is not already wrapped in <code>...</code>
pat = r'(?<!<code>)((?!<.*code>)<[^<>]*>)(?!</code>)'
rep = r'<code>\1</code>'
# subn returns (new_string, number_of_substitutions)
(a, na) = re.subn(pat, rep, x)
print(a)
# Running subn again on the result is ok since I used
# negative lookaheads and negative lookbehinds.
# Without them, a second pass would re-wrap every tag
# and you'd get stuff like
# <code><code></code><code><div></code><code></code></code>
(b, nb) = re.subn(pat, rep, a)
print(b)
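One possible caveat with the pattern above: the `.*` inside `(?!<.*code>)` is greedy, so a bare tag gets skipped whenever `code>` appears anywhere later on the same line, for example when the input already contains one wrapped tag. Below is a minimal sketch of a narrower variant, assuming the goal is only to leave the <code> and </code> tags themselves, and tags sitting directly next to them, alone. The names `TAG` and `wrap_tags` are made up for illustration and are not part of the `re` module.

```python
import re

# Hypothetical variant of the pattern: skip <code> and </code> themselves,
# any tag immediately preceded by <code>, and any tag immediately
# followed by </code>.
TAG = re.compile(r'(?<!<code>)(?!</?code>)(<[^<>]*>)(?!</code>)')

def wrap_tags(text):
    """Wrap each bare tag in <code>...</code>; return (new_text, count)."""
    return TAG.subn(r'<code>\1</code>', text)

# Mixed input: one bare tag, one tag that is already wrapped.
once, n1 = wrap_tags('<div> is bare, <code><br></code> is wrapped')
twice, n2 = wrap_tags(once)  # a second pass should find nothing to do

print(once)
print('%d substitution(s), then %d' % (n1, n2))
```

Like the original, this is not thoroughly tested; it still ignores things like <code> tags that carry attributes or a `<` that does not start a tag.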
Hope that helps. FYI, I referred to 'Text Processing In Python' by David
Mertz to try to figure this out.
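On the speed question, the two approaches can be timed directly rather than guessed at. The following is a rough sketch, assuming fresh input that contains no <code> tags yet, so it uses the plain pattern r'<[^<>]*>' from the question. The function names `by_substitution` and `by_insertion` are made up for illustration, and the numbers will vary by machine and Python version.

```python
import re
import timeit

TAG = re.compile(r'<[^<>]*>')
x = '<div> are not allowed, these arent either <br>'

def by_substitution(text):
    # Let the regex engine do the replacement in one call;
    # \g<0> stands for the whole match.
    return TAG.sub(r'<code>\g<0></code>', text)

def by_insertion(text):
    # Locate each tag, then splice <code> and </code> around it by hand.
    parts = []
    last = 0
    for m in TAG.finditer(text):
        parts.append(text[last:m.start()])
        parts.append('<code>' + m.group(0) + '</code>')
        last = m.end()
    parts.append(text[last:])
    return ''.join(parts)

# Both approaches should produce the same string.
assert by_substitution(x) == by_insertion(x)

for fn in (by_substitution, by_insertion):
    seconds = timeit.timeit(lambda: fn(x), number=10000)
    print('%s: %.4f s' % (fn.__name__, seconds))
```

Timing on a string this short mostly measures call overhead, so it is worth repeating the test with input that looks like the real data.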
Rich