simplest way to strip a comment from the end of a line?
Paul McGuire
ptmcg at austin.rr.com
Thu Dec 4 17:35:42 EST 2008
Yowza! My eyes glaze over when I see re's like "r'(?m)^(?P<data>.*?
(".*?".*?)*)(?:#.*?)?$"!
Here's a simple recognizer that reads source code and suppresses
comments. A comment will be a '#' character followed by the rest of
the line. We need the recognizer to also detect quoted strings, so
that any would-be '#' comment introducers that are in a quoted string
*won't* incur the stripping wrath of the recognizer. A quoted string
must be recognized before recognizing a '#' comment introducer.
With our input tests given as:
tests = '''this is a test 1
this is a test 2 #with a comment
this is a '#gnarlier' test #with a comment
this is a "#gnarlier" test #with a comment
'''.splitlines()
Here is such a recognizer, implemented using pyparsing:
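from pyparsing import quotedString, Suppress, restOfLine

# a comment is '#' plus the rest of the line; Suppress drops it from the output
comment = Suppress('#' + restOfLine)
# quotedString is tried first, so a '#' inside quotes is never taken as a comment
recognizer = quotedString | comment

for t in tests:
    print(t)
    print(recognizer.transformString(t))
    print()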
Prints:
this is a test 1
this is a test 1

this is a test 2 #with a comment
this is a test 2

this is a '#gnarlier' test #with a comment
this is a '#gnarlier' test

this is a "#gnarlier" test #with a comment
this is a "#gnarlier" test

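For comparison, the same ordering rule (look for a quoted string before looking for '#') can be sketched in plain Python with no library at all. This is only a rough illustration, not what pyparsing does internally: strip_comment is a made-up name, triple-quoted strings are not handled, and an unmatched quote (say, the apostrophe in "don't") will hide a later '#', which the pyparsing version does not suffer from.

```python
def strip_comment(line):
    """Return line with a trailing '#' comment removed (sketch only)."""
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == '\\':
                i += 1          # skip the escaped character
            elif ch == quote:
                quote = None    # closing quote
        elif ch in ('"', "'"):
            quote = ch          # opening quote: checked before '#'
        elif ch == '#':
            return line[:i]     # comment introducer outside any string
        i += 1
    return line


tests = '''this is a test 1
this is a test 2 #with a comment
this is a '#gnarlier' test #with a comment
this is a "#gnarlier" test #with a comment
'''.splitlines()

for t in tests:
    print(strip_comment(t))
```

Like transformString, this leaves whatever whitespace came before the '#' in place; call rstrip() on the result if that matters.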
For some added fun, add a parse action to quoted strings, to know when
we've really done something interesting:
def detectGnarliness(tokens):
    # tokens[0] is the matched quoted string, quotes included
    if '#' in tokens[0]:
        print("Ooooh, how gnarly! ->", tokens[0])

quotedString.setParseAction(detectGnarliness)
Now our output becomes:
this is a test 1
this is a test 1

this is a test 2 #with a comment
this is a test 2

this is a '#gnarlier' test #with a comment
Ooooh, how gnarly! -> '#gnarlier'
this is a '#gnarlier' test

this is a "#gnarlier" test #with a comment
Ooooh, how gnarly! -> "#gnarlier"
this is a "#gnarlier" test

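One thing to watch: quotedString is a module-level object, so calling setParseAction on it changes it for every other user of pyparsing in the same program. A possible variation, sketched here, attaches the action to a copy instead and collects the gnarly strings in a list rather than printing them. The names gnarly, collect_gnarliness and quoted are made up for this example; copy(), setParseAction() and transformString() are the same pyparsing calls used above, under their original camelCase names.

```python
from pyparsing import quotedString, Suppress, restOfLine

gnarly = []  # hypothetical collector for quoted strings containing '#'


def collect_gnarliness(tokens):
    # Returning None leaves the matched tokens unchanged.
    if '#' in tokens[0]:
        gnarly.append(tokens[0])


# copy() leaves the shared module-level quotedString untouched
quoted = quotedString.copy().setParseAction(collect_gnarliness)
comment = Suppress('#' + restOfLine)
recognizer = quoted | comment

line = "this is a '#gnarlier' test #with a comment"
stripped = recognizer.transformString(line)
print(stripped)
print(gnarly)
```

After the run, gnarly holds each quoted string that contained a '#', quotes included, so the caller can decide what to do with them.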
-- Paul