[New-bugs-announce] [issue4766] email documentation needs to be precise about strings/bytes
David M. Beazley
report at bugs.python.org
Mon Dec 29 16:21:49 CET 2008
New submission from David M. Beazley <beazley at users.sourceforge.net>:
Documentation for the email package needs to be clearer about the
usage of strings and bytes. In particular:
1. All operations that parse email messages, such as message_from_file()
or message_from_string(), operate on *text*, not binary data. So
the file must be opened in text mode, and strings must be text strings,
not byte strings.
2. All operations that set/get the payload of a message operate on
byte strings. For example, using m.get_payload() on a Message
object returns binary data as a byte string.
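A minimal sketch of the two points above, assuming a current Python 3
interpreter (behaviour on 3.0 itself is not verified here). The message
text, addresses, and variable names are made up for illustration. The
parse functions are given text (a str and a text-mode file object), and
the payload is requested with decode=True so that it comes back as a
byte string:

```python
import email
import io

# Hypothetical message used only for illustration; the body is the
# base64 encoding of b"hello".
raw_text = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: test\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)

# Parsing operates on text: a str ...
msg_from_str = email.message_from_string(raw_text)

# ... or a file object opened in text mode (io.StringIO stands in for
# open(path, "r") here).
msg_from_file = email.message_from_file(io.StringIO(raw_text))

# With decode=True the transfer encoding is undone and the payload is
# returned as bytes.
payload = msg_from_str.get_payload(decode=True)
subject = msg_from_file["Subject"]

print(type(payload).__name__, payload)
print(subject)
```

Passing decode=True is a choice made for this sketch; what get_payload()
returns without it depends on the Python version and the message.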
Opinion: There might be other bug reports about this, but I'm not
advocating that the email module should support reading messages from
binary mode files or byte strings. Email and MIME were originally
developed with the assumption that messages would always be handled as
text. At a minimum, this assumed that messages would stay intact even if
processed as 7-bit ASCII. By extension, everything should still work
if processed as Unicode. So, I think the use of text-mode files is
entirely consistent with this if you wanted to keep the module "as is."
There may be some confusion on this matter because if you're reading or
writing email messages (or sending them across a socket), you may
encounter messages stored in the form of byte strings instead of text.
People will then wonder why a byte string can't be parsed by this module
(especially given that email messages only use character values in the
range of 0-127).
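One way to bridge that gap, sketched under the assumption that the
message really is pure 7-bit ASCII, is to decode the bytes to text before
handing them to the parser. The sample bytes and variable names below are
hypothetical:

```python
import email

# Hypothetical bytes as they might arrive from a socket or a file
# opened in binary mode.
raw_bytes = b"From: a@example.com\r\nSubject: hi\r\n\r\nbody text\r\n"

# Decode explicitly; this raises UnicodeDecodeError if any byte is
# outside the range 0-127, which makes the ASCII assumption visible.
text = raw_bytes.decode("ascii")

# The parser is then given text, as it expects.
msg = email.message_from_string(text)

print(msg["Subject"])
print(msg.get_payload())
```

This keeps the module "as is" and puts the bytes-to-text step in the
caller's hands; it is not a statement of how the module itself should
behave.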
nosy: beazley, georg.brandl
title: email documentation needs to be precise about strings/bytes
versions: Python 3.0
Python tracker <report at bugs.python.org>