Regex recursion error example.

Yin yin_12180 at yahoo.com
Fri Nov 1 10:13:20 EST 2002


After tinkering with this issue for a day or so, I've decided to use
xmllib to solve the problem.  But for future reference, I've attached
the piece of text that is failing and the two approaches that I've
tried to make the match.

Of course there are other approaches to doing this parse, but I am
interested in understanding the regex approach I am trying and its
limitations.

If there are no solutions using regex, I would be interested in seeing
a reference to articles or books that discuss overcoming particularly
long string matches.

Approach 1:
pattern = re.compile('<PubMedArticle>(.*?)</PubMedArticle>',
                     re.DOTALL)
self.citationlist = re.findall(pattern, allinput)

Approach 2:
comppat = re.compile(r'<PubMedArticle>((?:(?!<PubMedArticle>).)*)</PubMedArticle>',
                     re.DOTALL)
self.citationlist = re.findall(comppat, allinput)

There are three matches to make in this body of text.  The above code
has been failing on the second of the three.  This problem has only
been occurring on Linux Python and not on Windows Python (the stack in
Windows is just large enough to accommodate the matches).
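
For future reference, here is a minimal sketch of a regex-based
alternative that never asks the engine to match across a whole
article.  It assumes the <PubMedArticle> elements are not nested.  The
names extract_articles, OPEN_TAG and CLOSE_TAG are made up for
illustration, and this is untested against the text linked below.
Each search only has to match a short literal tag, and the article
body is taken by slicing between the two tag positions.

```python
import re

# Hypothetical names, for illustration only.
OPEN_TAG = re.compile(r'<PubMedArticle>')
CLOSE_TAG = re.compile(r'</PubMedArticle>')

def extract_articles(text):
    """Return the text between each <PubMedArticle>...</PubMedArticle> pair.

    Assumes the elements are not nested.
    """
    articles = []
    pos = 0
    while True:
        # Find the next opening tag at or after the current position.
        start = OPEN_TAG.search(text, pos)
        if start is None:
            break
        # Find the first closing tag after that opening tag.
        end = CLOSE_TAG.search(text, start.end())
        if end is None:
            break  # unterminated article: stop rather than guess
        # Slice out the body; no pattern has to span the article itself.
        articles.append(text[start.end():end.start()])
        pos = end.end()
    return articles
```

With this, the assignment in either approach above would become
self.citationlist = extract_articles(allinput).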

Text to match:

http://160.129.203.97/1998_xmltest.html

Please let me know by e-mail if the link is down.

Thanks again,
Yin


