problem with re.MULTILINE

MRAB python at mrabarnett.plus.com
Sun Oct 18 15:20:20 EDT 2009


Necronymouse wrote:
> Hello, I've got a little problem. I have this text:
> http://openpaste.org/en/secret/17343/pass-python and I need to parse
> it. So I wrote this:
> 
> patternNode = re.compile("""
> # Node (\w*).*
> (.*)""", re.MULTILINE)
> 
> 
> with open("test.msg", "r") as file:
>     testData = file.read()
> 
> for Node in re.findall(patternNode, testData):
>     print "Node:", Node[0]
>     print Node
> <<<
> 
> but it prints only one line of the text. If I use re.DOTALL, it
> doesn't print anything. Do you know where the problem is?
> 
I assume you mean that it's giving you only the first line of text of
each node.

"(.*)" will capture a single (and possibly empty) line of text, because
"." does not match a newline unless re.DOTALL is set.

"(.+\n)" will capture a single non-empty line of text ending with a
newline.
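
As a rough illustration of the difference between the two (the sample
strings below are made up, not taken from your paste):

```python
import re

# Hypothetical sample text; the real test.msg is not shown in this thread.
sample = "first line\nsecond line\n"

# Without re.DOTALL, "." never matches a newline, so ".*" stops at the
# end of the current line.
any_line = re.match(r"(.*)", sample).group(1)

# ".+\n" needs at least one character and then consumes the newline too.
non_empty_line = re.match(r"(.+\n)", sample).group(1)

# At the start of an empty line, ".*" still matches (the empty string),
# whereas ".+\n" does not match at all.
blank_first = "\nrest\n"
empty_match = re.match(r"(.*)", blank_first).group(1)
no_match = re.match(r"(.+\n)", blank_first)

print(repr(any_line))        # 'first line'
print(repr(non_empty_line))  # 'first line\n'
print(repr(empty_match))     # ''
print(no_match)              # None
```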

I think you want to capture multiple non-empty lines, each line ending
with a newline:

# Raw string, so the backslashes reach the regex engine unchanged.
patternNode = re.compile(r"""
# Node (\w*).*
((?:.+\n)*)""", re.MULTILINE)
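
Here is a small self-contained sketch of how that pattern behaves. I
can't see your file, so the data below is invented and only assumes the
"# Node <name>" layout implied by your pattern; the names test_data,
pattern_node and nodes are mine:

```python
import re

# Hypothetical data in the assumed layout: a "# Node <name>" header line
# followed by non-empty lines, with a blank line between nodes.
test_data = """
# Node alpha (primary)
key1 = value1
key2 = value2

# Node beta
key3 = value3
"""

# Same pattern as above, written as a raw string. Without re.VERBOSE the
# "#" and the spaces are literal characters, not a comment.
pattern_node = re.compile(r"""
# Node (\w*).*
((?:.+\n)*)""", re.MULTILINE)

# Each result is a (name, body) tuple; body holds every non-empty line
# after the header, up to the first blank line.
nodes = pattern_node.findall(test_data)

for name, body in nodes:
    print("Node:", name)
    print(body, end="")
```

Two things to keep in mind: the pattern starts with a literal newline,
so a "# Node" header on the very first line of the file would not be
matched, and re.MULTILINE only changes the meaning of "^" and "$", so it
makes no difference to this particular pattern.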

> Sorry for my English - it's not my native language...

It's better than my Czech/Slovak (depending on what Google says)! :-)


