Reg Exp: Need advice concerning "greediness"

Franz GEIGER fgeiger at datec.at
Sun Oct 1 11:56:45 EDT 2000


Thanks for your response!

> For me, using python2.0, I get this answer

Sorry, I forgot to mention the platform: Python 1.5.2 on NT4.0/SP 5.

> the htmllib module, imo.  It'll take you about the same amount of time

Sounds promising. It's a Python std module, isn't it? Yet I could not find
any sample scripts showing me how to use it. Any idea how to begin?
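
As a possible starting point, here is a minimal sketch of the parser-based
approach. It assumes a current Python 3 rather than 1.5.2, and it swaps in
html.parser.HTMLParser from the standard library, because htmllib is no
longer shipped with Python 3. The class name FontColorParser and the
attribute "colors" are made up for illustration.

```python
from html.parser import HTMLParser


class FontColorParser(HTMLParser):
    """Collect the COLOR attribute of every <font> start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # so <font COLOR=...> arrives as tag "font" with attribute "color".
        if tag == "font":
            for name, value in attrs:
                if name == "color":
                    self.colors.append(value)


parser = FontColorParser()
parser.feed('<h1><font COLOR="#FF0000">Heading Level 1</font></h1>')
parser.close()
print(parser.colors)
```

The parser takes care of case, quoting and attribute order, which is the
part that is awkward to get right with a regular expression.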

Thanks in advance
Franz GEIGER


Alex <the_brain at mit.edu> wrote in message
news:etdu2aykkd1.fsf at mint-square.mit.edu...
>
> > I tried:
> > sRslt = "<h1><font COLOR="#FF0000">Heading Level 1</font></h1>";
> > print re.findall(re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', re.I |
> > re.S), sRslt);
> >
> > This returns [("<h1><font, , COLOR="#FF0000">Heading Level
> > 1</font></h1>)].
> > I'd expected to receive [("<h1><font , COLOR="#FF0000", >Heading Level
> > 1</font></h1>)].
>
> For me, using python2.0, I get this answer
>
> [('<h1><font', '', ' COLOR=')]
>
> which is different from what you got, and what you expected.  Also, what
> you got is not syntactically correct, I think.  Could you paste the
> output directly from the interpreter?
>
> In general, for this sort of thing, you are better off learning to use
> the htmllib module, imo.  It'll take you about the same amount of time
> to learn it this time as to get the regexp correct, and you'll have a
> far more appropriate framework for the next such problem that comes
> along.
>
> Alex.
>
> --
> Speak softly but carry a big carrot.
>
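
For reference, a short sketch of why the quoted pattern behaves this way. It
assumes a current Python 3, and the names html, truncated, original and
anchored are made up for illustration. Two things seem to be going on. First,
the lazy ".*?" after FONT matches nothing and "(COLOR=.*?)*" is satisfied with
zero repetitions, so "[ |>]" consumes the space and the greedy ".*" swallows
the rest. Second, the sRslt assignment as posted nests double quotes, so
Python ends the string after COLOR= and treats everything from # onward as a
comment, which would explain the shorter result under Python 2.0. The
anchored pattern is only one possible rewrite, not the definitive fix.

```python
import re

# The string as intended, written with single quotes so the inner
# double quotes survive.
html = '<h1><font COLOR="#FF0000">Heading Level 1</font></h1>'

# The string Python actually sees from the statement as posted:
# the literal ends at the second double quote, the rest is a comment.
truncated = "<h1><font COLOR="

# Pattern from the quoted message. Note that "|" inside [...] is a
# literal pipe character, not alternation.
original = re.compile(r'(.*?FONT.*?)(COLOR=.*?)*([ |>].*)', re.I | re.S)

# One possible rewrite: consume the whitespace after "<font" in group 1
# and make the COLOR attribute a single optional group that stops at
# its closing quote.
anchored = re.compile(r'(.*?<font\s+)(COLOR="[^"]*")?(.*)', re.I | re.S)

original_result = original.findall(html)
truncated_result = original.findall(truncated)
anchored_result = anchored.findall(html)

print(original_result)
# [('<h1><font', '', ' COLOR="#FF0000">Heading Level 1</font></h1>')]
print(truncated_result)
# [('<h1><font', '', ' COLOR=')]
print(anchored_result)
# [('<h1><font ', 'COLOR="#FF0000"', '>Heading Level 1</font></h1>')]
```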




