Regex substitution trouble
Cameron Simpson
cs at zip.com.au
Tue Oct 28 22:14:54 EDT 2014
On 28Oct2014 04:02, massi_srb at msn.com <massi_srb at msn.com> wrote:
>I'm not really sure if this is the right place to ask about regular
>expressions, but since I'm using Python I thought I could give it a try :-)
>Here is the problem: I'm trying to write a regex in order to substitute all the occurrences of the form $"somechars" with another string. This is what I wrote:
>
>newstring = re.sub(ur"""(?u)(\$\"[\s\w]+\")""", subst, oldstring)
>
>This works pretty well, but it has a problem: I also need it to handle the case in which the internal string contains double quotes, but only if preceded by a backslash, that is, something like $"somechars_with\\"doublequotes".
>Can anyone help me to correct it?
People seem to be making this harder than it should be.
I'd just be fixing up your definition of what's inside the quotes. There seem
to be 3 kinds of things:
- not a double quote or backslash
- a backslash followed by a double quote
- a backslash followed by not a double quote
Kind 3 is a policy call: does the backslash take the following character with
it or not? I would go with treating it like kind 2 myself.
So you have:
1 [^\\"]
2 \\"
3 \\[^"]
and fold 2 and 3 into:
2+3 \\.
So the inside of your regexp becomes:
([^\\"]|\\.)*
and the whole thing becomes:
\$"(([^\\"]|\\.)*)"
and as a raw string:
ur'\$"(([^\\"]|\\.)*)"'
choosing single quotes to be more readable given the double quotes in the
regexp.
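For anyone trying this out, here is a minimal sketch of the pattern in use. It
assumes Python 3, where the ur'' prefix is not accepted and a plain r'' string
is already Unicode. The subst callable and the sample input are made up for
illustration; they are not from the original question.

```python
import re

# The pattern from above, written with a Python 3 raw string.
PATTERN = re.compile(r'\$"(([^\\"]|\\.)*)"')

def subst(match):
    # Hypothetical replacement: wrap the captured inner text in angle
    # brackets so it is easy to see what group 1 picked up.
    return '<' + match.group(1) + '>'

# Sample input (an assumption): one plain case and one with escaped quotes.
oldstring = r'a $"plain words" b $"with \"escaped\" quotes" c'
newstring = PATTERN.sub(subst, oldstring)
print(newstring)
# a <plain words> b <with \"escaped\" quotes> c
```

Note that group 1 still contains the backslashes; stripping them is a separate
step if you want the unescaped text. Also, "." does not match a newline unless
re.DOTALL is given, so a backslash directly before a line break would not match
as written.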
Cheers,
Cameron Simpson <cs at zip.com.au>
--
cat: /Users/cameron/rc/mail/signature.: No such file or directory
Language... has created the word "loneliness" to express the pain of
being alone. And it has created the word "solitude" to express the glory
of being alone. - Paul Johannes Tillich