remove strings from source
M.E.Farmer
mefjr75 at hotmail.com
Sat Feb 26 18:59:44 EST 2005
qwweeeit wrote:
> Thank you for your suggestion, but it is too complicated for me...
> I decided to proceed in steps:
> 1. Take away all commented lines
> 2. Rebuild the multi-lines as single lines
Ummm,
OK, all I can say is: did you try this?
If not, save it as a module, then import it into the interpreter and try
it.
This is a dead simple module to do *exactly* what you asked for :)
Like I said, I have done this before, so I will restate: *I HAVE FAILED AT
THIS BEFORE, MANY TIMES*. Now I have a solution.
It writes to stdout by default but can write to a file-like object if you
give it one.
It already handles continued lines, so there is no need to futz around with
a separate solution for them.
Here is an example:
Py> filein = """
... class Stripper:
...     '''python comment and whitespace stripper
...     '''
...     def __init__(self, raw):
...         ''' Store the source text & set some flags.
...         '''
...         self.raw = raw
...
...     def format(self, out=sys.stdout, comments=0,
...                spaces=1, untabify=1,eol='unix'):
...         '''Parse and send the colored source.'''
...         # Store line offsets in self.lines
...         self.lines = [0, 0]
...         pos = 0
...         # Strips the first blank line if 1
...         self.lasttoken = 1
...         self.temp = StringIO.StringIO()
...         self.spaces = spaces
...         self.comments = comments
...
...         if untabify:
...             self.raw = self.raw.expandtabs()
...         self.raw = self.raw.rstrip()+' '
...         self.out = out
... """
Py> replacer = ReplaceParser(filein, out=sys.stdout)
Py> replacer.format()
class Stripper:
    s000001
    def __init__(self, raw):
        s000002
        self.raw = raw
    def format(self, out=sys.stdout, comments=0,
               spaces=1, untabify=1,eol=s000003):
        s000004
        # Store line offsets in self.lines
        self.lines = [0, 0]
        pos = 0
        # Strips the first blank line if 1
        self.lasttoken = 1
        self.temp = StringIO.StringIO()
        self.spaces = spaces
        self.comments = comments
        if untabify:
            self.raw = self.raw.expandtabs()
        self.raw = self.raw.rstrip()+s000005
        self.out = out
Py> replacer.StringMap
{'s000004': "'''Parse and send the colored source.'''",
's000005': "' '",
's000001': "'''python comment and whitespace stripper :)\n '''",
's000002': "''' Store the source text & set some flags.\n '''",
's000003': "'unix'"}
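For anyone reading along without the module at hand, the placeholder idea in the session above can be roughly sketched with the standard tokenize module. This is not the ReplaceParser class itself, only a minimal sketch written for Python 3; the function name replace_strings and its return shape are made up for illustration, and the s%06d naming just mimics the output shown. It assumes the input tokenizes cleanly and does not handle f-strings on Python 3.12 and later, which are tokenized differently.

```python
import io
import tokenize


def replace_strings(source):
    """Hypothetical helper: swap every string literal for a placeholder.

    Returns (new_source, string_map), where string_map maps each
    placeholder name (s000001, s000002, ...) to the original literal.
    """
    # Character offset at which each line starts, so a token's
    # (row, col) position can be turned into an index into source.
    line_starts = [0]
    for line in io.StringIO(source):
        line_starts.append(line_starts[-1] + len(line))

    string_map = {}
    pieces = []
    pos = 0  # end of the last string literal copied so far
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        start = line_starts[tok.start[0] - 1] + tok.start[1]
        end = line_starts[tok.end[0] - 1] + tok.end[1]
        name = "s%06d" % (len(string_map) + 1)
        string_map[name] = tok.string
        pieces.append(source[pos:start])  # text before the literal, verbatim
        pieces.append(name)               # placeholder instead of the literal
        pos = end
    pieces.append(source[pos:])           # whatever follows the last literal
    return "".join(pieces), string_map
```

Because the text between string tokens is copied verbatim, comments, indentation and continued lines come through untouched, and a quote character inside a comment is never mistaken for a string.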
You can also strip out comments with a few lines.
It can easily get single comments or doubles.
Add this in your __call__ function:
[snip]
            self.pos = newpos
            return
        # kills comments
        if (toktype == tokenize.COMMENT):
            return
        if (toktype == token.STRING):
            sname = self.StringName.next()
[snip]
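The same check on tokenize.COMMENT can be tried on its own, outside the class. What follows is only a rough standalone sketch for Python 3, not the __call__ method from the module; strip_comments is a hypothetical name, and it assumes the source tokenizes without errors.

```python
import io
import tokenize


def strip_comments(source):
    """Hypothetical helper: return source with every comment removed."""
    # Character offset at which each line starts, so a token's
    # (row, col) position can be turned into an index into source.
    line_starts = [0]
    for line in io.StringIO(source):
        line_starts.append(line_starts[-1] + len(line))

    pieces = []
    pos = 0  # end of the last comment skipped so far
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        start = line_starts[tok.start[0] - 1] + tok.start[1]
        end = line_starts[tok.end[0] - 1] + tok.end[1]
        # Keep the code before the comment, minus the blanks that
        # separated the code from the comment.
        pieces.append(source[pos:start].rstrip(" \t"))
        pos = end
    pieces.append(source[pos:])
    return "".join(pieces)
```

A line that held nothing but a comment is left behind as an empty line, so line numbers stay the same. A '#' inside a string literal is not touched, because the tokenizer reports it as part of a STRING token rather than as a COMMENT.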
If you insist on writing something go ahead.
Let me know what your solution is, I am curious.
M.E.Farmer