[Spambayes] Mailbox class in the spambayes project & python 2.2.1

Alexander Leidinger Alexander@Leidinger.net
Wed, 25 Sep 2002 18:14:33 +0200


On Wed, 25 Sep 2002 11:58:36 -0400 Guido van Rossum <guido@python.org>
wrote:

> Content-Type: multipart/signed; micalg*=ansi-x3-4-1968''pgp-md5;
> 	protocol*=ansi-x3-4-1968''application%2Fpgp-signature;
> 	boundary*="ansi-x3-4-1968''EeQfGwPcQSOJBaQU"

[...]

>     def get_boundary(self, failobj=None):
>         """Return the boundary associated with the payload if present.
> 
>         The boundary is extracted from the Content-Type: header's
>         `boundary' parameter, and it is unquoted.
>         """
>         missing = []
>         boundary = self.get_param('boundary', missing)
>         if boundary is missing:
>             return failobj
>         return _unquotevalue(boundary.strip())
> 
> This fails because in this case, boundary is the tuple
> ('ansi-x3-4-1968', '', 'EeQfGwPcQSOJBaQU') where the actual boundary
> is the third element.  (The first is an encoding, a fancy name for
> ASCII AFAICT; the second is an optional natural language.)
> 
> Barry isn't sure whether get_boundary() should expect such a tuple
> here or whether get_param() should return the 3rd item of the tuple.

IMHO expecting such a tuple in get_boundary() (and perhaps propagating
it to upper layers) seems logical: you may be able to get useful
additional information out of it, compared to throwing it away in
get_param().
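
A minimal sketch of that option, assuming get_param() keeps returning
either a plain string or the (charset, language, value) triple. The
names split_param() and boundary_text() are hypothetical helpers made
up for illustration, not part of the email package; only
email.utils.unquote() is a real stdlib call.

```python
from email.utils import unquote


def split_param(value):
    """Normalize a parameter value to (charset, language, text).

    `value` is either a plain string or an RFC 2231 style triple.
    Empty charset/language fields are reported as None.
    """
    if isinstance(value, tuple):
        charset, language, text = value
        return charset or None, language or None, text
    return None, None, value


def boundary_text(param, failobj=None):
    """Return the unquoted boundary, accepting a string or a triple.

    Hypothetical stand-in for the body of get_boundary(); `param` is
    whatever get_param('boundary') handed back, or None if missing.
    """
    if param is None:
        return failobj
    _charset, _language, text = split_param(param)
    return unquote(text.strip())
```

With the triple quoted above, boundary_text() gives back the bare
boundary string, while split_param() keeps the charset available to
upper layers that want it.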

> I get other failures in Hammie on the content-type; the code (part of
> crack_content_xyz() in spambayes/tokenizer.py) is this:
> 
>     for x in msg.get_charsets(None):
>         if x is not None:
>             yield 'charset:' + x.lower()
> 
> where the code doesn't expect a triple of the form ('ansi-x3-4-1968',
> '', 'us-ascii').  I don't know yet whether get_charsets() should have
> returned the third tuple item or whether crack_content_xyz() should be
> fixed to expect a tuple.

This depends. Do you want to add get_charset_encoding() and
get_charset_fancy_name()?
 no: crack_content_xyz() should understand the triple.
 yes: get_charsets() should return the third element.
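
For the "no" case, a rough sketch of a tokenizer loop that understands
the triple. It assumes the charset list may mix None, plain strings and
(charset, language, value) triples; charset_tokens() is a hypothetical
stand-in for the loop in crack_content_xyz(), not the actual spambayes
code.

```python
def charset_tokens(charsets):
    """Yield 'charset:...' tokens from strings or RFC 2231 style triples."""
    for x in charsets:
        if x is None:
            continue
        if isinstance(x, tuple):
            # The third element of the triple carries the charset name.
            x = x[2]
        yield 'charset:' + x.lower()
```

It would be called as charset_tokens(msg.get_charsets(None)) in place
of the loop quoted above.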

Bye,
Alexander.

-- 
              The best things in life are free, but the
                expensive ones are still worth a look.

http://www.Leidinger.net                       Alexander @ Leidinger.net
  GPG fingerprint = C518 BC70 E67F 143F BE91  3365 79E2 9C60 B006 3FE7