[Spambayes] It gets funnier all the time....

Wed Feb 12 15:38:24 EST 2003

The missing c-t-e header is probably a clue in itself...  The problem is that
this check runs before HTML stripping, so the first word could conceivably be
longer than 60 characters for an innocent reason, say an anchor tag with a
long URL.  I think I like your original idea of checking for a base64
character match better.  - TimS
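
A minimal sketch of what such a base64 character check might look like, written
as standalone Python rather than as a patch to tokenizer.py.  The function name
looks_like_base64 and the min_len parameter are made up for illustration; the
60-character threshold is borrowed from the patch quoted below.  It only tests
the first whitespace-delimited word, so treat it as a heuristic, not a decoder.

```python
import base64
import re

# One run of standard base64 alphabet characters, with optional '=' padding.
_BASE64_WORD = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64(text, min_len=60):
    """Return True if the first word is long AND made only of base64 chars.

    Hypothetical helper: combines the length test from the quoted patch
    with a character-class test, so a long URL does not trigger it.
    """
    words = text.split()
    if not words:
        # Empty or whitespace-only payload: nothing to inspect.
        return False
    first = words[0]
    return len(first) > min_len and _BASE64_WORD.fullmatch(first) is not None


# A base64 body wraps at 76 characters per line, so its first "word" is long.
encoded = base64.encodebytes(b"x" * 100).decode("ascii")

# A long URL is also a long first word, but ':' and '.' are not base64 chars.
url = "http://www.example.com/" + "a" * 80
```

With the length test alone both samples would be sent to the repair function;
adding the character test lets only the encoded one through.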

2/12/2003 2:08:35 PM, Skip Montanaro <skip at pobox.com> wrote:

>
>    Tim> Ya, I just discovered that too... so back to the original mail that
>    Tim> started this whole thing.  Was that not base64?  - TimS
>
>What about this?  No c-t-e header, so the base64 crap will come back
>unchanged.  If the first "word" of the decoded payload is longer than 60
>characters, feed to the base64 fixer-upper:
>
>*** /tmp/skip/tokenizer.py.~1.4~        Wed Feb 12 14:06:51 2003
>--- /tmp/skip/tokenizer.py      Wed Feb 12 14:06:51 2003
>***************
>*** 1331,1336 ****
>--- 1331,1339 ----
>              # Decode, or take it as-is if decoding fails.
>              try:
>                  text = part.get_payload(decode=True)
>+                 if len(text.split()[0]) > 60:
>+                     # just in case it's encoded but no c-t-e header was given
>+                     yield "control: no cte header"
>+                     text = try_to_repair_damaged_base64(text)
>              except:
>                  yield "control: couldn't decode"
>                  text = part.get_payload(decode=False)
>
>Skip
>
>

c'est moi - TimS
http://www.fourstonesExpressions.com
http://wecanstopspam.org