[Tutor] String matching?

orbitz orbitz at ezabel.com
Tue Dec 7 15:03:45 CET 2004

Instead of copying and pasting and then just doing a simple match, why
not use urllib2 to download the HTML and then run through it with HTMLParser?
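
Here is a rough sketch of what I mean, written for a current Python 3,
where urllib2 has become urllib.request and HTMLParser lives in
html.parser. The names AppletCollector and make_links, the sample markup,
and the idea that the target sits in <param> tags named "url" and "text"
are all made up for illustration; your applets may keep the name and URL
somewhere else, so adjust the lookups to match.

```python
from html.parser import HTMLParser


class AppletCollector(HTMLParser):
    """Collect the attributes of every <applet> tag and its <param> children."""

    def __init__(self):
        super().__init__()
        self.applets = []      # one dict per <applet> found
        self._current = None   # the applet we are inside, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names already lowercased.
        if tag == "applet":
            self._current = {"attrs": dict(attrs), "params": {}}
            self.applets.append(self._current)
        elif tag == "param" and self._current is not None:
            param = dict(attrs)
            self._current["params"][param.get("name")] = param.get("value")

    def handle_endtag(self, tag):
        if tag == "applet":
            self._current = None


def make_links(html):
    """Return one plain <a> link per applet (hypothetical param names)."""
    parser = AppletCollector()
    parser.feed(html)
    parser.close()
    links = []
    for applet in parser.applets:
        params = applet["params"]
        links.append('<a href="%s">%s</a>' % (params.get("url"), params.get("text")))
    return links


# Invented sample input, only to show the shape of the data.
sample = (
    '<p><applet code="Nav.class" width="100">'
    '<param name="url" value="page2.html">'
    '<param name="text" value="Next page">'
    '</applet></p>'
)

for link in make_links(sample):
    print(link)
```

To run it against the live page rather than a pasted copy, you would read
the HTML with urllib.request.urlopen(...).read(), decode it to a string,
and pass that to make_links. Fetching it directly also sidesteps whatever
the copy and paste did to the text.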

Liam Clarke wrote:

>Hi all, 
>I have a large amount of HTML that a previous person has liberally
>sprinkled a huge amount of applets through, instead of html links,
>which kills my browser to open.
>So, want to go through and replace all applets with nice simple links,
>and want to use Python to find the applet, extract a name and a URL,
>and create the link.
>My problem is, somewhere in my copying and pasting into the text file
>that the HTML currently resides in, it got all messed up it would
>seem, and there's a bunch of strange '=' all through it. (Someone said
>that the code had been generated in Frontpage. Is that a good thing or
>bad thing?)
>So, I want to search for <applet code=, but it may be in the file as 
> code
>or <applet
>        code
>or <ap=
>etc. etc. (Full example of yuck here
>So, I want to write a search that will match <applet code and
><app=\nlet code (etc. etc.) without having to strip the file of '='
>and '\n'.
>I was thinking the re module is for this sort of stuff? Truth is, I
>wouldn't know where to begin with it, it seems somewhat powerful.
>Or, there's a much easier way, which I'm missing totally. If there is,
>I'd be very grateful for pointers.
>Thanks for any help you can offer.
>Liam Clarke
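
If you would rather stay with the pasted text file, the re module can do
the tolerant match you describe. This is only a sketch, and it assumes
the damage is always a stray '=' immediately followed by a line break,
plus arbitrary whitespace where a single space used to be. The helper
name tolerant_pattern and the sample text are invented for the example.

```python
import re

# An optional stray '=' followed by a line break, allowed between any
# two characters of the text we are looking for.
SOFT_BREAK = r"(?:=\r?\n)?"


def tolerant_pattern(literal):
    """Compile a regex that matches `literal` even when it is split by
    '=' + newline, or when its spaces have become runs of whitespace."""
    parts = []
    for ch in literal:
        if ch.isspace():
            # A space may have turned into newlines, indentation, etc.
            parts.append(r"(?:\s|=\r?\n)+")
        else:
            parts.append(re.escape(ch))
    return re.compile(SOFT_BREAK.join(parts), re.IGNORECASE)


applet_start = tolerant_pattern("<applet code")

# Invented sample showing the three kinds of breakage from the question.
text = (
    'x <applet code="A.class"> '
    'y <app=\nlet code="B.class"> '
    'z <applet\n        code="C.class">'
)

for match in applet_start.finditer(text):
    print(match.start(), repr(match.group()))
```

Each match object gives you the position of an applet tag, so you can
slice out what follows and pull the name and URL from there. The pattern
only finds the start of the tag; it does not repair the file.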
