HTML Parser

Kragen Sitaker kragen at
Sun Dec 31 00:02:52 EST 2000

In article <qnkito1p9jj.fsf at>,
David M. Cooke <cookedm at> wrote:
>At some point, kragen at (Kragen Sitaker) wrote:
>> In a string like "x<a>b<c>d", this will match "<a>b<c>", because the .*
>> matches "a>b<c".  This explains your problem.
>> Fixing it is harder.
>Not that hard: use the pattern '<.*?>'.

Well, he wants to upcase his tag names; '<.*?>' will still match the
whole tag, including every attribute name and attribute value, so his
URLs will get upcased.  This is bad, and fixing it *is* harder.
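
To make the difference concrete, here is a rough sketch using Python's
re module.  The sample string with its example.com URL and the helper
name upcase_tag_names are made up for illustration; nothing here is
the original poster's code.  The last pattern upcases only the name
that directly follows '<' or '</'.  It is still fooled by comments,
script bodies, and a '<' inside an attribute value, so a real HTML
parser is probably the safer route.

```python
import re

sample = 'x<a href="http://example.com/Page">b</a>d'  # hypothetical input

# Greedy: .* runs on to the last '>' in the string, swallowing both tags.
greedy = re.findall(r'<.*>', 'x<a>b<c>d')

# Non-greedy: .*? stops at the first '>', so each tag matches separately.
lazy = re.findall(r'<.*?>', 'x<a>b<c>d')

# Upcasing the whole non-greedy match still upcases attributes and URLs.
naive = re.sub(r'<.*?>', lambda m: m.group(0).upper(), sample)

# Capture only the tag name right after '<' or '</' and upcase just that.
TAG_NAME = re.compile(r'(</?)([A-Za-z][A-Za-z0-9]*)')


def upcase_tag_names(html):
    """Upcase tag names only, leaving attributes and text untouched."""
    return TAG_NAME.sub(lambda m: m.group(1) + m.group(2).upper(), html)


fixed = upcase_tag_names(sample)

print(greedy)
print(lazy)
print(naive)
print(fixed)
```

The third line printed shows the URL mangled to upper case; the fourth
keeps href and the URL as they were and changes only the tag names.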

<kragen at>       Kragen Sitaker     <>
Perilous to all of us are the devices of an art deeper than we possess
       -- Gandalf the White [J.R.R. Tolkien, "The Two Towers", Bk 3, Ch. XI]
