[Spambayes] Heads up! Tokenizer changes

Tim Peters tim_one at email.msn.com
Wed May 21 00:29:40 EDT 2003


[Skip Montanaro]
> It seems to me that the simplest way to punt and guarantee a parseable
> message would be to simply change the Content-Type of the message to
> text/plain.  You obviously make some compromises (like
> multipart-alternative messages would produce duplicate tokens), but
> it should be guaranteed to parse, shouldn't it?

The failures here come *from* message_from_string() -- there is no Message
object to work with at this point.  Sometimes the body can't be parsed, and
sometimes not even the headers can be parsed.  Workarounds are needed for
both kinds of errors, and some are in place, but they're half-hearted and
scattered around the codebase now.  mboxutils.get_message() is the best
current routine to build on.
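
To make the two failure modes concrete, here is a rough sketch of a layered
fallback. It is not what mboxutils.get_message() actually does; the name
parse_with_fallback and the injectable parser argument are made up for
illustration. Assumption: the parser raises an exception when it can't cope
(recent versions of the email package are lenient and rarely do, so a
stricter parser can be passed in).

```python
import email
from email.message import Message


def parse_with_fallback(text, parser=email.message_from_string):
    """Return a Message for text, degrading gracefully if parsing fails.

    Hypothetical helper: `parser` is any callable that takes a string and
    returns a Message, raising an exception when it cannot parse.
    """
    # First choice: a normal, full parse.
    try:
        return parser(text)
    except Exception:
        pass  # the body, the headers, or both could not be parsed

    # Fallback 1: the body is the problem.  Parse the headers alone and
    # attach everything after the first blank line as opaque text.
    headers, _sep, body = text.partition("\n\n")
    try:
        msg = parser(headers + "\n\n")
    except Exception:
        pass  # the headers are unusable too
    else:
        # Drop the headers that promised structure we could not parse,
        # so the message is treated as plain text from here on.
        del msg["Content-Type"]
        del msg["Content-Transfer-Encoding"]
        msg.set_payload(body)
        return msg

    # Fallback 2: not even the headers parse.  Return a header-less
    # message whose payload is the entire original text.
    msg = Message()
    msg.set_payload(text)
    return msg
```

Either way the caller gets back a Message object, so the tokenizer always
has something to work with, at the cost of losing MIME structure (and
possibly headers) for the messages that needed the fallback.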



