Parsing HTML - modify URLs
fumanchu at amor.org
Wed Jul 7 16:47:02 CEST 2004
> I am trying to parse an HTML page and only modify URLs within tags -
> e.g. inside IMG, A, SCRIPT, FRAME tags etc...
> I have built one using HTMLParser.HTMLParser and it works fine... on
> good HTML. Having done a Google search, it looks like parsing dodgy
> HTML and having HTMLParser choke is a common theme.
Haven't used it, but Beautiful Soup sounds like it fits the bill:
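
For comparison, here is a minimal sketch of the attribute-rewriting approach described in the question. It does not use Beautiful Soup; it assumes Python 3's standard html.parser, which as far as I know skips over malformed markup instead of raising. The names URL_ATTRS, URLRewriter and rewrite_urls are made up for illustration, and the tag/attribute table is only a small subset of the places a URL can appear.

```python
from html import escape
from html.parser import HTMLParser

# Tag -> attribute that carries a URL (illustrative subset, not exhaustive).
URL_ATTRS = {"a": "href", "img": "src", "script": "src", "frame": "src"}


class URLRewriter(HTMLParser):
    """Re-emit the input, passing URL attributes through `rewrite`."""

    def __init__(self, rewrite):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.rewrite = rewrite
        self.out = []

    def _start(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute, e.g. <input disabled>
                parts.append(name)
                continue
            if URL_ATTRS.get(tag) == name:
                value = self.rewrite(value)
            # Attribute values arrive unescaped, so escape them again.
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._start(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._start(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def rewrite_urls(html, rewrite):
    """Return `html` with each known URL attribute replaced by rewrite(url)."""
    parser = URLRewriter(rewrite)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    # Unquoted attribute and an unclosed <a>: the sort of input in question.
    dodgy = '<p><a href=page.html>x<img src="a.png"></p>'
    print(rewrite_urls(dodgy, lambda url: "http://example.com/" + url))
```

Note that this normalises the markup slightly (tag names are lowercased, attribute values are always double-quoted), so the output is not byte-for-byte the input with only the URLs changed. If the original formatting has to survive exactly, a different approach would be needed.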