[Tutor] String matching?

orbitz orbitz at ezabel.com
Tue Dec 7 15:03:45 CET 2004


Instead of copying and pasting and then just doing a simple match, why 
not use urllib2 to download the HTML and then run through it with HTMLParser?
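
Something along these lines might do it. This is only a rough sketch, written against current Python, where urllib2 has become urllib.request and HTMLParser lives in html.parser. The class name, the helper names, and above all the "url" and "text" param names are my guesses; check what your applets really carry in their <param> tags before relying on it.

```python
from html import escape
from html.parser import HTMLParser
from urllib.request import urlopen


class AppletCollector(HTMLParser):
    """Record the attributes and <param> values of every <applet> element."""

    def __init__(self):
        super().__init__()
        self.applets = []      # one dict per <applet> seen
        self._current = None   # the applet we are inside, if any

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "applet":
            self._current = {"attrs": dict(attrs), "params": {}}
            self.applets.append(self._current)
        elif tag == "param" and self._current is not None:
            param = dict(attrs)
            if param.get("name") is not None:
                self._current["params"][param["name"]] = param.get("value")

    def handle_endtag(self, tag):
        if tag == "applet":
            self._current = None


def fetch(url):
    """Download a page and return it as text (encoding is assumed)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def to_link(applet):
    """Build a plain link, assuming hypothetical 'url' and 'text' params."""
    params = applet["params"]
    href = params.get("url") or ""
    text = params.get("text") or href
    return '<a href="%s">%s</a>' % (escape(href, quote=True), escape(text))
```

To use it, create an AppletCollector, feed() it the text returned by fetch(), call close(), and pass each entry of its applets list to to_link(). Working from the downloaded page rather than the pasted copy should also sidestep the stray '=' problem.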

Liam Clarke wrote:

>Hi all, 
>
>I have a large amount of HTML through which a previous person has
>liberally sprinkled a huge number of applets instead of HTML links,
>and opening it kills my browser.
>
>So, I want to go through and replace all applets with nice simple links,
>and I want to use Python to find the applet, extract a name and a URL,
>and create the link.
>
>My problem is that somewhere in my copying and pasting into the text
>file that the HTML currently resides in, it all got messed up, it would
>seem, and there's a bunch of strange '=' all through it. (Someone said
>that the code had been generated in Frontpage. Is that a good thing or
>a bad thing?)
>
>So, I want to search for <applet code=, but it may be in the file as 
>
><app=
>let
> code
>
>or <applet
>        code
>
>or <ap=
>plet 
>
>etc. etc. (Full example of yuck here
>http://www.rafb.net/paste/results/WcKPCy64.html)
>
>So, I want to write a search that will match <applet code and
><app=\nlet code (etc. etc.) without having to strip the file of '='
>and '\n'.
>
>I was thinking the re module is for this sort of stuff? Truth is, I
>wouldn't know where to begin with it, it seems somewhat powerful.
>
>Or, there's a much easier way, which I'm missing totally. If there is,
>I'd be very grateful for pointers.
>
>Thanks for any help you can offer.
>
>Liam Clarke
>
>  
>
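
If you do want to try the re module on the pasted file as it stands, here is a minimal sketch. It assumes the damage is always an '=' immediately followed by a line break dropped into the middle of a word, which is what your examples suggest but which I have not checked against the full paste. SOFT_BREAK, tolerant and APPLET_CODE are names I made up.

```python
import re

# An optional "=" plus line break that may have been dropped between
# any two characters of a word (assumed form of the damage).
SOFT_BREAK = r"(?:=\r?\n)?"


def tolerant(word):
    """Return regex source matching `word` even when split by soft breaks."""
    return SOFT_BREAK.join(re.escape(ch) for ch in word)


# "<applet", then any mix of whitespace and soft breaks, then "code".
APPLET_CODE = re.compile(
    "<" + SOFT_BREAK + tolerant("applet")
    + r"(?:\s|=\r?\n)+" + tolerant("code"),
    re.IGNORECASE,
)
```

APPLET_CODE.search(text) then finds the tag whether it appears as "<applet code", "<app=\nlet\n code" or "<applet\n        code", without the file being stripped first. The same tolerant() helper can be applied to any other word you need to find.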


