[Spambayes] It gets funnier all the time....
Tim Stone - Four Stones Expressions
tim at fourstonesExpressions.com
Wed Feb 12 15:38:24 EST 2003
The missing c-t-e header is probably a clue... The problem here is that this
happens before HTML stripping, so the first word could conceivably be longer
than 60 characters, if it's an anchor tag with a long URL or something... I
think I like your original idea of checking for a base64 character match... - TimS
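
A rough sketch of the difference between the two checks, not the actual
tokenizer code. The helper names (first_word, first_word_is_long,
looks_like_base64) and the 60-character limit as a parameter are assumptions
made for illustration only. The length-only test fires on any long first
token, such as a long URL, while the character-match test also requires the
token to consist solely of base64 alphabet characters:

```python
import base64
import re

# Standard base64 alphabet, optionally followed by up to two '=' pad chars.
_BASE64_RUN = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def first_word(text):
    """Return the first whitespace-delimited token, or '' if there is none."""
    words = text.split(None, 1)
    return words[0] if words else ""


def first_word_is_long(text, limit=60):
    """Length-only heuristic: is the first token longer than `limit` chars?"""
    return len(first_word(text)) > limit


def looks_like_base64(text, limit=60):
    """Length plus character match: a long first token made only of
    base64 alphabet characters."""
    word = first_word(text)
    return len(word) > limit and _BASE64_RUN.fullmatch(word) is not None


if __name__ == "__main__":
    # encodebytes wraps its output at 76 characters per line.
    encoded = base64.encodebytes(b"spam " * 40).decode("ascii")
    url_text = "http://example.com/" + "a" * 80 + " click here"
    for sample in (encoded, url_text):
        print(first_word_is_long(sample), looks_like_base64(sample))
```

Both samples pass the length-only test, but only the encoded one passes the
character match, since the URL contains ':' and '.' characters.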
2/12/2003 2:08:35 PM, Skip Montanaro <skip at pobox.com> wrote:
> Tim> Ya, I just discovered that too... so back to the original mail that
> Tim> started this whole thing. Was that not base64? - TimS
>What about this? No c-t-e header, so the base64 crap will come back
>unchanged. If the first "word" of the decoded payload is longer than 60
>characters, feed to the base64 fixer-upper:
>*** /tmp/skip/tokenizer.py.~1.4~ Wed Feb 12 14:06:51 2003
>--- /tmp/skip/tokenizer.py Wed Feb 12 14:06:51 2003
>*** 1331,1336 ****
>--- 1331,1339 ----
> # Decode, or take it as-is if decoding fails.
> text = part.get_payload(decode=True)
>+ if text and len((text.split(None, 1) or [""])[0]) > 60:
>+ # just in case it's encoded but no c-t-e header was
>+ yield "control: no cte header"
>+ text = try_to_repair_damaged_base64(text)
> yield "control: couldn't decode"
> text = part.get_payload(decode=False)
c'est moi - TimS