Regex recursion error example.
Yin
yin_12180 at yahoo.com
Fri Nov 1 10:13:20 EST 2002
After tinkering with this issue for a day or so, I've decided to use
xmllib to solve the problem. But for future reference, I've attached
the piece of text that is failing and the two approaches that I've
tried to make the match.
Of course there are other approaches to doing this parse, but I am
interested in understanding the regex approach I am trying and its
limitations.
If there is no solution using regex, I would be interested in seeing
a reference to articles or books that discuss how to handle particularly
long string matches.
Approach 1:
import re
# Non-greedy match of everything between the opening and closing tags.
pattern = re.compile('<PubMedArticle>(.*?)</PubMedArticle>',
                     re.DOTALL)
self.citationlist = re.findall(pattern, allinput)
Approach 2:
# Match one character at a time, refusing to cross another opening tag.
comppat = re.compile(r'<PubMedArticle>((?:(?!<PubMedArticle>).)*)</PubMedArticle>',
                     re.DOTALL)
self.citationlist = re.findall(comppat, allinput)
There are three matches to make in this body of text. The above code
has been failing on the second of the three. This problem has only
been occurring on Linux Python and not on Windows Python (the stack in
Windows is just large enough to accommodate the matches).
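For comparison, here is a minimal sketch of pulling out the same blocks
with plain string searches instead of a regex, so the length of a match
never affects the matching engine's stack. The function name
extract_blocks and its parameters are hypothetical and not part of my
original code, and the sketch assumes the tags are never nested. It
behaves like Approach 1 (it does not reject a stray opening tag inside a
block the way Approach 2 does).

```python
def extract_blocks(text, open_tag="<PubMedArticle>", close_tag="</PubMedArticle>"):
    """Return the text between each open_tag/close_tag pair, left to right.

    Hypothetical helper: assumes the tags are not nested.
    """
    blocks = []
    pos = 0
    while True:
        start = text.find(open_tag, pos)
        if start == -1:
            break  # no more opening tags
        start += len(open_tag)
        end = text.find(close_tag, start)
        if end == -1:
            break  # unterminated block: stop rather than guess
        blocks.append(text[start:end])
        pos = end + len(close_tag)
    return blocks
```

In the code above this would be used as
self.citationlist = extract_blocks(allinput).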
Text to match:
http://160.129.203.97/1998_xmltest.html
Please let me know by e-mail if the link is down.
Thanks again,
Yin