Regular Expression for Finding and Deleting comments
MRAB
python at mrabarnett.plus.com
Tue Jan 4 13:26:48 EST 2011
On 04/01/2011 17:11, Jeremy wrote:
> I am trying to write a regular expression that finds and deletes (replaces with nothing) comments in a string/file. A comment is either a line whose first non-whitespace character is a 'c', or everything from a dollar sign to the end of the line. Replacing these comments with nothing isn't too hard. The trouble is that each comment is left behind as an empty line, because the new-line isn't captured by the regular expression.
>
> Below, I have copied a minimal example. Can someone help?
>
> Thanks,
> Jeremy
>
>
> import re
>
> text = """ c
> C - Second full line comment (first comment had no text)
> c Third full line comment
> F44:N 2 $ Inline comments start with dollar sign and go to end of line"""
>
> commentPattern = re.compile("""
> (^\s*?c\s*?.*?| # Comment start with c or C
> \$.*?)$\n # Comment starting with $
> """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)
>
Part of the problem is that you're not using raw string literals or
doubling the backslashes.
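To see why that matters here, a small sketch may help (the sample string and the names non_raw and raw are made up for illustration). In a non-raw literal, "\n" is turned into a real newline before the re module sees it, and re.VERBOSE then discards it as insignificant whitespace, so the line ending is never matched:

```python
import re

# Non-raw literal: "\n" becomes an actual newline character in the pattern
# string, and re.VERBOSE then ignores it like any other whitespace.
# (The backslash before $ is doubled so that only "\n" differs.)
non_raw = re.compile("""
    \\$.*$\n   # intended: dollar-sign comment plus its line ending
""", re.VERBOSE | re.MULTILINE)

# Raw literal: the two characters backslash + n reach the regex engine,
# which treats them as "match a newline".
raw = re.compile(r"""
    \$.*$\n    # dollar-sign comment plus its line ending
""", re.VERBOSE | re.MULTILINE)

sample = "A $ one\nB\n"

print(repr(non_raw.sub("", sample)))  # newline after the comment survives
print(repr(raw.sub("", sample)))      # newline is removed with the comment
```
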
Try something like this:
commentPattern = re.compile(r"""
(^[ \t]*c.*\n| # Comment start with c or C
[ \t]*\$.*) # Comment starting with $
""", re.VERBOSE|re.MULTILINE|re.IGNORECASE)
> found = commentPattern.finditer(text)
>
> print("\n\nCard:\n--------------\n%s\n------------------" %text)
>
> if found:
>     print("\nI found the following:")
>     for f in found: print(f.groups())
>
> else:
>     print("\nNot Found")
>
> print("\n\nComments replaced with ''")
> replaced = commentPattern.sub('', text)
> print("--------------\n%s\n------------------" %replaced)
>
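As a quick check, here is a self-contained sketch that applies the suggested pattern to the sample text from the question. The names comment_pattern and replaced are just for this illustration, and the sample text is copied as it appears in the post, so any leading whitespace lost in transit is not restored:

```python
import re

text = """ c
C - Second full line comment (first comment had no text)
c Third full line comment
F44:N 2 $ Inline comments start with dollar sign and go to end of line"""

comment_pattern = re.compile(r"""
    (^[ \t]*c.*\n|   # Full-line comment starting with c or C, with its newline
    [ \t]*\$.*)      # Inline comment starting with $, with the blanks before it
""", re.VERBOSE | re.MULTILINE | re.IGNORECASE)

# Full-line comments disappear together with their line endings; the inline
# comment is trimmed off the end of the data line.
replaced = comment_pattern.sub("", text)
print(repr(replaced))  # 'F44:N 2'
```

Note that the first alternative requires a trailing newline, so a full-line comment on the very last line of a file that has no final newline would not be removed by it.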