[New-bugs-announce] [issue4661] email.parser: impossible to read messages encoded in a different encoding
report at bugs.python.org
Sun Dec 14 17:55:46 CET 2008
New submission from Adeodato Simó <dato at net.com.org.es>:
Currently, email.parser/feedparser can only parse messages that come
as a string, or from a file opened in text mode.
Email messages, however, can contain 8-bit characters in an encoding
other than the local one (and still be valid e-mails, of course), so I
think the parser needs a way to accept bytes as input.
At the moment, as far as I can see, it is not possible to parse
some perfectly valid messages with Python 3.0.
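A minimal sketch of the problem, using a made-up message (addresses and
body are illustrative only). It also shows the kind of bytes-based entry
point being asked for; email.parser.BytesParser is used here, which was
added to the standard library in Python 3.2, after this report, and does
not exist in Python 3.0.

```python
from email.parser import BytesParser

# Hypothetical message: ASCII headers, body in ISO-8859-1 sent as 8bit.
raw = (
    b"From: sender@example.org\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Adeodato Sim\xf3\n"
)

# Decoding the whole message with one assumed encoding (UTF-8 here, as a
# text-mode open() in a UTF-8 locale would do) fails on the 0xF3 byte.
try:
    raw.decode("utf-8")
    text_mode_ok = True
except UnicodeDecodeError:
    text_mode_ok = False

# Parsing the bytes directly avoids guessing: the charset is read from
# the message's own headers and then used to decode the body.
msg = BytesParser().parsebytes(raw)
charset = msg.get_content_charset()
body = msg.get_payload(decode=True).decode(charset)
```

With this input, the strict UTF-8 decode fails, while the bytes parser
reports the declared charset and the body decodes cleanly.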
I don't think it's appropriate to ask that files be opened with the
proper encoding, and then for the parser to read them. First, it is
not possible to know what encoding that would be without parsing the
message. And second, a single message could contain parts in different
encodings, and a mailbox holding assorted messages almost certainly will.
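To illustrate the second point, here is a sketch with a hypothetical
two-part message whose parts declare different charsets. No single
encoding decodes the whole file correctly, but each part can be decoded
on its own once the structure has been parsed from bytes. It assumes
email.message_from_bytes, which only exists from Python 3.2 onwards.

```python
import email

# Hypothetical multipart message: the same word, once as ISO-8859-1 and
# once as UTF-8, each part labelled with its own charset.
raw = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="sep"\n'
    b"\n"
    b"--sep\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Sim\xf3\n"
    b"--sep\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Sim\xc3\xb3\n"
    b"--sep--\n"
)

msg = email.message_from_bytes(raw)

decoded = []
for part in msg.walk():
    if part.is_multipart():
        continue  # skip the container, keep only the leaf parts
    # Fall back to ASCII when a part declares no charset (an assumption
    # made for this sketch, not a rule taken from the report).
    charset = part.get_content_charset() or "ascii"
    text = part.get_payload(decode=True).decode(charset)
    decoded.append((charset, text))
```

Both parts end up as the same text even though their bytes on disk
differ, which is only possible because the charset is looked up per
part after parsing.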
Also, message objects will need a way to return a bytes representation,
for the reasons explained above, and particularly if one wants to
write back the message without modifying it.
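A sketch of the round trip described above, again with a made-up
message. It relies on Message.as_bytes(), which was added in Python 3.4
and is therefore not available in the version this report is about. For
a simple, unmodified message like this one the output is expected to
match the input byte for byte; messages with unusual formatting may not
be reproduced exactly.

```python
import email

# Hypothetical message with an 8bit ISO-8859-1 body.
raw = (
    b"From: sender@example.org\n"
    b"To: rcpt@example.org\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=iso-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"Sim\xf3\n"
)

# Parse from bytes, then serialise back to bytes without touching the
# message in between.
msg = email.message_from_bytes(raw)
roundtrip = msg.as_bytes()
```

The result can be written to a file opened in binary mode, so the
non-ASCII byte in the body is never re-encoded along the way.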
components: Library (Lib)
title: email.parser: impossible to read messages encoded in a different encoding
versions: Python 3.0
Python tracker <report at bugs.python.org>