Bottleneck? More efficient regular expression?

Tina Li tina_li23AThotmailDOTcom
Tue Sep 23 17:40:18 CEST 2003


I've been struggling with a regular expression for parsing XML files; it keeps raising the runtime error "maximum
recursion limit exceeded". Here is the pattern string:


The file format is straightforward. Here is a sample:

<targetSeq name="1onc">blah
<alignment size="335">
<align> :| ..| :    .  |  .                         |.  .  :</align>

# this group of tags then repeats multiple times in the file

If I search for the pattern up to "</template>" (i.e. no <anotherTag> onwards), it works fine. As soon as I add the
later parts to the pattern, it gives the error.

I heard that non-greedy (*?) is inefficient, so I tried replacing all .*? with (?!<target>) etc., which means "if the
next piece of text doesn't match the <target> tag, keep going". But it gives the same error.
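
As a minimal sketch of the substitution described above, here are the two forms side by side. The sample string and the
exact patterns are made up for illustration; they are not the actual pattern from this post, and only the <target> tag
name is borrowed from it.

```python
import re

# Hypothetical sample text; not taken from the real file.
text = "<target>first</target><target>second</target>"

# Non-greedy form: .*? grows one character at a time until </target> can match.
lazy = re.compile(r"<target>(.*?)</target>", re.DOTALL)

# Lookahead form: consume a character only while the upcoming text
# is not an opening or closing <target> tag.
guarded = re.compile(r"<target>((?:(?!</?target>).)*)</target>", re.DOTALL)

lazy_hits = lazy.findall(text)
guarded_hits = guarded.findall(text)
print(lazy_hits)
print(guarded_hits)
```

Both forms advance through the text one character at a time, so the lookahead version is not obviously cheaper than
the non-greedy one, which may be why the replacement made no difference.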

So my question is: what is the bottleneck in this pattern? Could someone more experienced in REs give some hints here?

Your help is greatly appreciated!


