beautifulsoup .vs tidy
bruce
bedouglas at earthlink.net
Sat Jul 1 11:26:59 EDT 2006
hi paddy...
that's exactly what i'm trying to accomplish... i've used tidy, but it seems
to still generate warnings...
initFile -> tidy -> cleanFile -> perl app (using xpath/libxml)
the xpath/libxml functions in the perl app complain about the file. my
thought is that tidy isn't cleaning enough, or that the perl xpath/libxml
functions are too strict!
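
one way to tell those two cases apart is to run cleanFile through a strict xml parser before the perl step. below is a minimal sketch using only the python standard library; the function name and the two sample strings are made up for illustration, and it assumes the downstream xpath code expects well-formed xml rather than forgiving html:

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        # A strict XML parser stops at the first structural problem it finds.
        return str(exc)
    return None


# Hypothetical samples: browsers accept the first one, a strict XML parser does not.
dirty = "<html><body><p>one<br><p>two</body></html>"
clean = "<html><body><p>one<br/></p><p>two</p></body></html>"

print(well_formedness_error(dirty))  # an error message describing the first problem
print(well_formedness_error(clean))  # None
```

if the cleaned file still fails a check like this, the cleaning step is probably not producing xml-style output; if it passes and the perl side still complains, the strictness is more likely in how the perl app parses the file.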
which is why i decided to see if anyone on the python side has
experienced/solved this problem..
-bruce
-----Original Message-----
From: python-list-bounces+bedouglas=earthlink.net at python.org
[mailto:python-list-bounces+bedouglas=earthlink.net at python.org]On Behalf
Of Paddy
Sent: Saturday, July 01, 2006 1:09 AM
To: python-list at python.org
Subject: Re: beautifulsoup .vs tidy
bruce wrote:
> hi...
>
> never used perl, but i have an issue trying to resolve some html that
> appears to be "dirty/malformed" regarding the overall structure. in
> researching validators, i came across the beautifulsoup app and wanted to
> know if anybody could give me pros/cons of the app as it relates to any of
> the other validation apps...
>
I'm not too sure of what you are after. You mention tidy in the subject
which made me think that maybe you were trying to generate well-formed
HTML from malformed web pages that browsers can nonetheless interpret.
If that is the case then try HTML tidy:
http://www.w3.org/People/Raggett/tidy/
- Pad.