[Python-3000] Questions about email bytes/str (python 3000)
victor.stinner at haypocalc.com
Tue Aug 14 04:22:36 CEST 2007
After many tests, I'm unable to convert the email module to Python 3000. I'm also
unable to decide on the best type for some of the content.
(1) Should email parts be stored as byte strings or character strings?
Related methods: Generator class, Message.get_payload(), Message.as_string().
Let's take an example: a multipart (MIME) email with latin-1 and base64 (ASCII)
sections. Mixing latin-1 and ASCII means mixing bytes, so the best type should be
bytes.
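To make the mixed-encoding case concrete, here is a minimal sketch in present-day Python 3. The boundary string and the variable names are made up for illustration and the MIME part headers are left out; the point is only that the assembled body has no single charset, so it can be held as bytes but not decoded as one str.

```python
import base64

# Hypothetical two-part body: one latin-1 text section, one base64 section.
text_part = "caf\u00e9".encode("latin-1")        # latin-1 bytes, not valid ASCII
binary_part = base64.b64encode(b"\x00\x01\x02")  # base64 output is pure ASCII bytes

# The sections can only be joined at the byte level.
body = b"\n".join([
    b"--BOUNDARY",
    text_part,
    b"--BOUNDARY",
    binary_part,
    b"--BOUNDARY--",
])

# Decoding the whole body with one charset fails (or silently guesses).
try:
    body.decode("ascii")
    whole_body_is_ascii = True
except UnicodeDecodeError:
    whole_body_is_ascii = False
```

Each section can still be decoded on its own once its charset or transfer encoding is known.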
(2) Parsing a file (raw string): should the parser use bytes or str?
The parser uses str methods such as splitlines(), lower() and strip(), but
it should be easy to rewrite or avoid them. I think that low-level
parsing should be done on bytes. At the end, or once we know the charset, we
can convert to str.
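A rough sketch of that idea, written against present-day Python 3 where bytes objects provide partition(), split(), lower() and strip(). The helper names split_headers_and_body and charset_of are hypothetical and not part of the email module. The sketch ignores folded headers and normalises line endings in the body too, which a real parser would not do.

```python
def split_headers_and_body(raw: bytes):
    """Split a raw message into (headers, body), working on bytes only."""
    # Simplification: normalise CRLF to LF everywhere, including the body.
    head, _, body = raw.replace(b"\r\n", b"\n").partition(b"\n\n")
    headers = {}
    for line in head.split(b"\n"):
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def charset_of(content_type: bytes, default: str = "ascii") -> str:
    """Return the charset parameter of a Content-Type value, if present."""
    for param in content_type.split(b";")[1:]:
        key, _, value = param.partition(b"=")
        if key.strip().lower() == b"charset":
            return value.strip().strip(b'"').decode("ascii")
    return default


raw = b"Content-Type: text/plain; charset=latin-1\r\n\r\ncaf\xe9\r\n"
headers, body = split_headers_and_body(raw)

# Only now, with the charset known, convert the body to str.
text = body.decode(charset_of(headers[b"content-type"]))
```

Everything up to the last line stays in bytes; the conversion to str happens once, at the point where the charset has been read from the headers.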
About base64, I agree with Bill Janssen:
- base64MIME.decode converts string to bytes
- base64MIME.encode converts bytes to string
But decode may also accept bytes as input (as the base64 module does): use
str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict').
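The proposed signatures could look roughly like this on top of the standard base64 module. The function names b64_encode and b64_decode are placeholders for illustration, not the actual base64MIME API, and the sketch uses the 'strict' variant so that non-ASCII input is rejected instead of silently dropped.

```python
import base64


def b64_encode(data: bytes) -> str:
    """bytes in, str out: base64 text is always plain ASCII."""
    return base64.b64encode(data).decode("ascii")


def b64_decode(value) -> bytes:
    """str in, bytes out; bytes input is accepted and normalised first."""
    if isinstance(value, (bytes, bytearray)):
        # 'strict' raises UnicodeDecodeError on non-ASCII input;
        # 'ignore' would drop the offending bytes instead.
        value = str(value, "ascii", "strict")
    return base64.b64decode(value)


encoded = b64_encode(b"caf\xe9")
decoded_from_str = b64_decode(encoded)
decoded_from_bytes = b64_decode(encoded.encode("ascii"))
```

Both calls to b64_decode return the original bytes, so callers do not need to care which type they hold.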
I wrote four different (non-working) patches. So if you want to work on the email
module and Python 3000, please contact me first. When I have a better
patch, I will submit it.
Victor Stinner aka haypo