parsing email from stdin
Antoon Pardon
antoon.pardon at rece.vub.ac.be
Tue Oct 8 08:20:20 EDT 2013
I want to do some postprocessing on messages from a particular mailbox.
So I use getmail, which fetches the messages and feeds them to the stdin
of my program.
As I don't know what encoding these messages will be in, I thought it
would be prudent to read stdin as binary data.
Using Python 3.3 on a Debian box, I have the following code:
#!/usr/bin/python3
import sys
from email import message_from_file
sys.stdin = sys.stdin.detach()
msg = message_from_file(sys.stdin)
which gives me the following traceback:
  File "/home/apardon/.getmail/verdeler", line 7, in <module>
    msg = message_from_file(sys.stdin)
  File "/usr/lib/python3.3/email/__init__.py", line 56, in message_from_file
    return Parser(*args, **kws).parse(fp)
  File "/usr/lib/python3.3/email/parser.py", line 58, in parse
    feedparser.feed(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 167, in feed
    self._input.push(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 100, in push
    data, self._partial = self._partial + data, ''
TypeError: Can't convert 'bytes' object to str implicitly
which seems rather odd. The following headers are in the message:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
So why doesn't the email parser look up the charset and use that
to convert the data to a string type?
What is the canonical way to parse an email message from stdin?
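For reference, here is a minimal sketch of one way this could be done. It
relies on the fact that message_from_file expects a text-mode file, while
the email package also provides message_from_binary_file for binary file
objects such as sys.stdin.buffer (the same object that sys.stdin.detach()
returns). The helper names parse_message, body_text and main are
hypothetical, the "ascii" fallback charset is an assumption, and body_text
only handles non-multipart messages.

```python
#!/usr/bin/python3
import sys
from email import message_from_binary_file


def parse_message(binary_stream):
    """Parse one message from a binary file object (hypothetical helper)."""
    return message_from_binary_file(binary_stream)


def body_text(msg):
    """Decode a non-multipart body using the charset named in its headers."""
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    payload = msg.get_payload(decode=True)
    # Assumption: fall back to ASCII when no charset parameter is present.
    charset = msg.get_content_charset() or "ascii"
    return payload.decode(charset, errors="replace")


def main():
    # sys.stdin.buffer is the binary stream underneath the text-mode stdin.
    msg = parse_message(sys.stdin.buffer)
    print(msg["Subject"])
    if not msg.is_multipart():
        print(body_text(msg))
```

In the script that getmail runs, main() would be called at the bottom of
the file. The parser itself does not decode the body; the charset from the
Content-Type header is only applied when the payload is converted to text,
as body_text does here.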
--
Antoon Pardon