HTMLParser.HTMLParseError: EOF in middle of construct
Sérgio Monteiro Basto
sergio at sergiomb.no-ip.org
Tue Jun 19 21:09:32 EDT 2007
Stefan Behnel wrote:
> Sérgio Monteiro Basto wrote:
>> but it is one single error that blocks this.
>> Finally I found it. It is:
>> <td colspan="2"align="center"
>> If I put:
>> <td colspan="2" align="center"
>>
>> p = re.compile('"align')
>> content = p.sub('" align', content)
>>
>> I can parse the HTML.
>> I don't know if it is a bug in HTMLParser.
>
> Sure, and next time your key doesn't open your neighbour's house, please
> report to the building company to have them fix the door.
>
The question here is whether
<td colspan="2"align="center"
is valid HTML or not.
I think it is valid; if so, it's a bug in HTMLParser.
If not, we still have a very bad error message ("EOF in middle of
construct"!?).
I have to use HTMLParser because I want to use only the Python 2.4 standard
library; I have to install the scripts on many machines.
And I have to parse many different sites. I just want to extract the links,
so a clean-up before parsing solves my problem very quickly.
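
For illustration, the clean-up-then-parse approach could look roughly like
the sketch below. It reuses the '"align' substitution quoted above and then
collects href values with an HTMLParser subclass. The names LinkExtractor and
extract_links are made up for this example, and the import fallback is only
there so the same code also loads on newer Pythons where the module has moved.

```python
import re

try:
    from html.parser import HTMLParser   # Python 3 location of the module
except ImportError:
    from HTMLParser import HTMLParser    # Python 2 standard library

# Same substitution as in the quoted message: restore the missing space
# when an "align" attribute directly follows a closing double quote.
MISSING_SPACE = re.compile('"align')


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collects the href value of every <a> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)


def extract_links(content):
    """Clean up the known markup problem, then return all link targets."""
    content = MISSING_SPACE.sub('" align', content)
    parser = LinkExtractor()
    parser.feed(content)
    parser.close()
    return parser.links
```

This only repairs the one pattern found so far; other sites may need further
substitutions added before the parse step.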
Thanks,
--
Sérgio M. B.