html parser, unexpected '<' char in declaration
Jesus Rivero (Neurogeek)
jrivero at latinux.org
Tue Feb 21 13:45:40 EST 2006
Oops!
You are totally right, guys. I overlooked the missing closing '>' because
I was thinking about possible errors in the use of ' or ".
Jesus
Tim Roberts wrote:
>"Jesus Rivero - (Neurogeek)" <jrivero at latinux.org> wrote:
>
>
>>Hmm, that's kind of a different issue, then.
>>
>>I can guess, from the error you pasted earlier, that the problem shown
>>is due to the fact Python is interpreting a "<" as an expression and not
>>as a char. Review your code or try to figure out the exact input you're
>>receiving within the MTA.
>>
>>
>
>Well, Jesus, you are 0 for 2. Sakcee pointed out what the exact problem
>was in his original message. The HTML he is being given is ill-formed; the
><!DOCTYPE directive is not closed. The SGML parser finds a <html> tag
>which it thinks is inside the <!DOCTYPE, and that's illegal.
>
>
>
>>>Well, I should probably explain more. This is part of an email. After
>>>the MTA delivers the email, it is stored in a local directory.
>>>After that, the email is parsed by the parser inside a web-based
>>>IMAP client at display time.
>>>
>>>I don't think I have the choice of rewriting the message, and I don't
>>>want to reject the message altogether.
>>>
>>>I can either 1- fix the incoming HTML by tidying it up,
>>>2- strip out only the plain text and display that you have spam, or
>>>3- ignore that malformed tag and display the rest.
>>>
>>>
>
>If this is happening with more than one message, you could check for it
>rather easily with a regular expression, or even just ''.find, and then
>either insert a closing '>' or delete everything up to the <html> before
>parsing it.
>
>
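For the archive, here is a minimal sketch of the two workarounds Tim
describes: inserting the missing '>' or dropping everything before the
<html> tag. The function names (close_doctype, drop_before_html) and the
sample message are made up for illustration, and the pattern assumes the
DOCTYPE has no internal subset or other '<' of its own, which should hold
for typical HTML mail.

```python
import re

# An opening "<!DOCTYPE" that reaches the next "<" without a ">" in between,
# i.e. a declaration that was never closed.
_UNCLOSED_DOCTYPE = re.compile(r'<!DOCTYPE[^<>]*(?=<)', re.IGNORECASE)

# The start of the <html> tag, in any letter case.
_HTML_TAG = re.compile(r'<html', re.IGNORECASE)


def close_doctype(html):
    """Insert the missing '>' at the end of an unclosed <!DOCTYPE."""
    return _UNCLOSED_DOCTYPE.sub(
        lambda m: m.group(0).rstrip() + '>\n', html, count=1)


def drop_before_html(html):
    """Delete everything before the first <html tag, if there is one."""
    m = _HTML_TAG.search(html)
    return html[m.start():] if m else html


# Hypothetical ill-formed input, similar to what the thread describes.
broken = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"\n'
          '<html><body>hi</body></html>')

fixed = close_doctype(broken)
stripped = drop_before_html(broken)
```

Either function can be applied to the message body just before it is
handed to the parser. A well-formed DOCTYPE is left untouched by
close_doctype, since the pattern only matches when a '<' appears before
any closing '>'.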
More information about the Python-list
mailing list