HTMLParser tag contents
g2 at seebelow.org
Mon May 8 19:42:55 CEST 2000
In article <byiR4.6194$Za1.95070 at newsc.telia.net>, "Fredrik says...
>Grant Griffin <g2 at seebelow.org> wrote:
>> I experimented with your approach and it worked, but I finally decided
>> just to use 're', to preserve HTMLParser's other features. Here's what
>> I came up with:
>cannot really figure out what you're trying to do, but I can assure
>you that I wouldn't have done it that way...
(Oops! Is this comp.lang.perl.misc?--or does comp.lang.python also provide
a free newbie-mugging service? ;-)
>what exactly are those "other features" you want to preserve?
Extracting the title and metas, for example.
The dspGuru site currently has about 60 pages, all of which have been created
manually. I now need to change the style of all the headers and footers.
Also, in the future, I would like to be able to create new pages without headers
and footers using a standard HTML editor.
To that end, I have been planning to conscript HTMLParser and HTMLGen to build
new pages from "source" pages. The title and metas of each source will be
combined with program-generated headers/footers and with the stuff inside the
source's "<body>" tag (which HTMLGen will receive as "RawText"). The
"HTMLParserEx" class I had posted was a starting point for all that.
(So what's to not figure out? ;-)
Grant R. Griffin g2 at dspguru.com
Publisher of dspGuru http://www.dspguru.com
Iowegian International Corporation http://www.iowegian.com