[Tutor] Splitting email headers when using imaplib
Some Developer
someukdeveloper at gmail.com
Wed Feb 5 16:52:33 CET 2014
On 05/02/2014 09:12, Peter Otten wrote:
> Some Developer wrote:
>
>> I'm currently trying to download emails from an IMAP server using
>> Python. I can download the emails just fine but I'm having an issue when
>> it comes to splitting the relevant headers. Basically I'm using the
>> following to fetch the headers of an email message:
>>
>> typ, msg_header_content = self.m.fetch(msg_id, '(BODY.PEEK[HEADER])')
>>
>> then I can get a string containing the headers by accessing
>> msg_header_content[0][1]. This works fine but I need to split the
>> Subject header, the From header and the To header out into separate
>> strings so I can save the information in a database.
>>
>> I thought the following regular expressions would do the trick when
>> matched against the header string with re.MULTILINE, but apparently
>> they are wrong.
>>
>> msg_subject_regex = re.compile(r'^Subject:\.+\r\n')
>> msg_from_regex = re.compile(r'^From:\.+\r\n')
>> msg_to_regex = re.compile(r'^To:\.+\r\n')
>>
>> Can anyone point me in the right direction for this please? I'm at a
>> loss here.
> Maybe you can use the email package?
>
> >>> import email
> >>> msg = email.message_from_file(open("tmp.txt"))
> >>> msg["From"]
> 'Some Developer <someukdeveloper at gmail.com>'
> >>> msg["Subject"]
> 'Splitting email headers when using imaplib'
> >>> msg.keys()
> ['Path', 'From', 'Newsgroups', 'Subject', 'Date', 'Lines', 'Approved',
> 'Message-ID', 'NNTP-Posting-Host', 'Mime-Version', 'Content-Type',
> 'Content-Transfer-Encoding', 'X-Trace', 'X-Complaints-To',
> 'NNTP-Posting-Date', 'To', 'Original-X-From', 'Return-path', 'Envelope-to',
> 'Original-Received', 'Original-Received', 'X-Original-To', 'Delivered-To',
> 'Original-Received', 'X-Spam-Status', 'X-Spam-Evidence', 'Original-Received',
> 'Original-Received', 'Original-Received', 'DKIM-Signature', 'X-Received',
> 'Original-Received', 'User-Agent', 'X-Antivirus', 'X-Antivirus-Status',
> 'X-BeenThere', 'X-Mailman-Version', 'Precedence', 'List-Id',
> 'List-Unsubscribe', 'List-Archive', 'List-Post', 'List-Help',
> 'List-Subscribe', 'Errors-To', 'Original-Sender', 'Xref', 'Archived-At']
>
> There is also a message_from_string() function.
>
Awesome. That's exactly what I was looking for. Thanks.
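
For the archive, here is a minimal sketch of how the suggestion above might
be applied to the header block returned by fetch(). It assumes Python 3,
where imaplib hands back bytes, so it uses email.message_from_bytes()
rather than message_from_string(). The sample header bytes, the example.com
and example.org addresses and the extract_headers() helper are hypothetical
stand-ins for msg_header_content[0][1] and are not part of the original
code. A side benefit over the regular expressions is that folded
(multi-line) headers are unfolded by the parser.

```python
import email
from email import policy

# Hypothetical sample standing in for msg_header_content[0][1].
# The Subject header is deliberately folded across two lines.
raw_headers = (
    b"From: Some Developer <dev@example.com>\r\n"
    b"To: tutor@example.org\r\n"
    b"Subject: Splitting email headers\r\n"
    b" when using imaplib\r\n"
    b"\r\n"
)


def extract_headers(raw, names=("Subject", "From", "To")):
    """Parse a raw header block and return the requested headers as strings.

    A header that is absent from the message maps to None.
    """
    # policy.default gives unfolded, decoded header values.
    msg = email.message_from_bytes(raw, policy=policy.default)
    result = {}
    for name in names:
        value = msg[name]
        result[name] = None if value is None else str(value)
    return result


headers = extract_headers(raw_headers)
for name, value in headers.items():
    print(name, "=", value)
```

The resulting dictionary holds plain strings, which should be
straightforward to store in a database.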