[python-win32] regular expressions question
David.Cantrell@Gunter.AF.mil
Wed, 14 Aug 2002 12:37:22 -0500
Hi all,
Do you know a way to make a regular expression match across newlines, but
STOP when a certain pattern is reached?
In other words, given the following source:
Some leading text here.
Block 1
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Block 2
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Some trailing text here.
(I'm parsing VBScript files and extracting method comments, but the above is
simpler to deal with)
I have a regexp that retrieves a list of all Blocks, so given the above the
list looks like:
methodlist = [ "Block 1", "Block 2" ]
If I use re.DOTALL:
    for item in methodlist:
        print item, "\n-----\n"
        print re.search( item + ".*End Block", s, re.DOTALL ).group()
        print "\n"
I get the following (of course):
Block 1
-----
Block 1
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Block 2
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Block 2
-----
Block 2
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
But I eventually want to build a list that looks like this:
    [ ( "Block 1",
        "Symbol: text here\nSymbol: text here\nSymbol: text here"
      ),
      ( "Block 2",
        "Symbol: text here\nSymbol: text here\nSymbol: text here"
      )
    ]
In order to do that, I need to know how to make the regexp engine STOP once
it gets past the last "Symbol: " line after each Block declaration.
(I know the regexp I gave goes from Block..End Block, but that's only
because I don't know how to "get all Symbol lines that come immediately
after a Block declaration")
Any help is much appreciated!! :D
Thanks,
-dave
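
One possible way to get that list, offered as a sketch rather than a definitive answer: leave re.DOTALL off so that "." stays within a single line, and use a repeated group to collect only the "Symbol: " lines that directly follow a Block declaration. The sample string s and the "Block \d+" pattern for block names are assumptions based on the simplified source above; real VBScript would need a different header pattern.

```python
import re

# Sample input copied from the simplified source in the question (assumed layout:
# one item per line, no leading whitespace).
s = """Some leading text here.
Block 1
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Block 2
Symbol: text here
Symbol: text here
Symbol: text here
Other stuff here
End Block
Some trailing text here.
"""

# ^(Block \d+)\n        a block declaration on its own line (re.MULTILINE lets
#                       "^" match at the start of every line)
# ((?:Symbol: .*\n)+)   one or more consecutive "Symbol: " lines; without
#                       re.DOTALL, ".*" cannot cross a newline, so the group
#                       ends at the first line that does not start with "Symbol: "
pattern = re.compile(r"^(Block \d+)\n((?:Symbol: .*\n)+)", re.MULTILINE)

# findall returns one (name, symbols) tuple per match; strip the final newline
# so each entry matches the desired list format.
result = [(name, symbols.rstrip("\n")) for name, symbols in pattern.findall(s)]

for name, symbols in result:
    print(name)
    print(symbols)
    print("-----")
```

The match stops on its own at "Other stuff here" because that line does not begin with "Symbol: ", so no explicit "End Block" terminator is needed. If the Block..End Block approach is preferred instead, a non-greedy ".*?" together with re.DOTALL would stop at the first "End Block" rather than the last.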