[Python-3000] Questions about email bytes/str (python 3000)
Bill Janssen
janssen at parc.com
Wed Aug 15 03:44:54 CEST 2007
> > Let's take an example: multipart (MIME) email with latin-1 and
> > base64 (ascii) sections. Mix latin-1 and ascii => mix bytes. So the
> > best type should be bytes.
> >
> > => bytes
>
> Except that by the time they're parsed into an email message, they
> must be ascii, either encoded as base64 or quoted-printable. We also
> have to know at that point the charset being used, so I think it
> makes sense to keep everything as strings.
Actually, Victor's right here -- it makes more sense to treat them as
bytes. It's RFC 821 (SMTP) that requires 7-bit ASCII, not the MIME
format. Non-SMTP mail transports do exist, and are popular in various
places. Email transported via other transport mechanisms may, for
instance, use a Content-Transfer-Encoding of "binary" for some
sections of the message. Some parts of the top-most header of the
message may be counted on to be encoded as ASCII strings, but not the
whole message in general.
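
To make that concrete, here is a rough sketch of a message body that
cannot be handled as an ASCII string. It assumes the `email` package as
it ships in current Python 3 releases (not the API being designed in
this thread), and the sample message is made up for illustration.

```python
from email import message_from_bytes

# Hypothetical message: a latin-1 body sent with an 8-bit transfer
# encoding, as a non-SMTP transport might deliver it.
raw = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)

# The raw message is not valid ASCII, so treating it as an ASCII
# string fails outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Parsing from bytes keeps the body intact; the charset parameter says
# how to turn it into text afterwards.
msg = message_from_bytes(raw)
body = msg.get_payload(decode=True)      # bytes
charset = msg.get_content_charset()      # 'iso-8859-1'
text = body.decode(charset)
```

The headers here happen to be ASCII, but the body only becomes text
once the declared charset is applied to the bytes.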
> > About base64, I agree with Bill Janssen:
> > - base64MIME.decode converts string to bytes
> > - base64MIME.encode converts bytes to string
>
> I agree.
>
> > But decode may accept bytes as input (as the base64 module does): use
> > str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict').
>
> Hmm, I'm not sure about this, but I think that .encode() may have to
> accept strings.
Personally, I think it would avoid more errors if it didn't. Let the
user explicitly encode the string to a particular representation
before calling base64.encode().
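
A small sketch of why the explicit step matters, using
base64.b64encode() from the standard library as a stand-in for the
base64MIME encoder discussed above (that substitution is an assumption
on my part). The same string gives different base64 output depending on
the encoding chosen, so an encoder that accepted strings would have to
guess.

```python
import base64

text = "caf\u00e9"  # "café"

# The caller picks the byte representation explicitly.
as_latin1 = text.encode("iso-8859-1")   # b'caf\xe9'
as_utf8 = text.encode("utf-8")          # b'caf\xc3\xa9'

b64_latin1 = base64.b64encode(as_latin1)
b64_utf8 = base64.b64encode(as_utf8)

# In current Python 3, b64encode() refuses a str instead of guessing.
try:
    base64.b64encode(text)
    accepted_str = True
except TypeError:
    accepted_str = False
```

The two results differ (b'Y2Fm6Q==' versus b'Y2Fmw6k='), which is
exactly the ambiguity an implicit conversion would hide.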
Bill