Re: Internet Data Handling » mailbox
Adam Jensen
hanzer at riseup.net
Sat Oct 22 19:49:29 EDT 2016
On 10/22/2016 03:24 AM, dieter wrote:
> In addition to the previous (excellent) responses:
>
> A "message" models a MIME (RFC1521 Multipurpose Internet Mail Extensions)
> message (the international standard for the structure of emails).
> The standard tells you that a message consists essentially of two
> parts: a set of headers and a body and describes standard headers
> and their intended meaning (e.g. "To", "From", "Subject", ...).
> It allows a message to contain non-standard headers as well.
>
> With this knowledge, your "keys" related question can be answered:
> there is a (case insensitive) key for each header actually present
> in your message. If the message contains several headers with
> the same name, subscript access gives you the first one;
> there is an alternative method to access all of them.
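A minimal sketch of that header access, for anyone following along. The message text, the addresses, and the X-Tag header are made up for illustration; the calls are the standard library's email.message_from_string, subscript access, and get_all.

```python
import email

# A made-up message with one header (X-Tag) that appears twice.
raw = (
    "From: Alice <alice@example.org>\n"
    "To: Bob <bob@example.org>\n"
    "X-Tag: first\n"
    "X-Tag: second\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
msg = email.message_from_string(raw)

first_tag = msg["x-tag"]         # header names are matched case-insensitively
all_tags = msg.get_all("X-Tag")  # every value for that header, in order
missing = msg["Cc"]              # an absent header gives None, not a KeyError
```

Subscript access only ever sees the first X-Tag value, so get_all is the one to use when a header can repeat.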
Thanks. I needed to search for emails to/from a specific person and
extract them from a [Google mail archive][1].
[1]: https://takeout.google.com/settings/takeout
This is my quick and dirty little one-shot script to get the job done.
search_mbox.py
--------------------------------------------------------------
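#!/usr/bin/env python2.7
import mailbox
import sys

# Substring to search for, matched case-insensitively.
name = sys.argv[2].lower()

for message in mailbox.mbox(sys.argv[1]):
    # Only consider messages that carry both a From and a To header.
    if message.has_key("From") and message.has_key("To"):
        addrs = message.get_all("From")
        addrs.extend(message.get_all("To"))
        for addr in addrs:
            # Match the name anywhere in the address, including at the start.
            if name in addr.lower():
                print message
                break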
--------------------------------------------------------------
Usage: ./search_mbox.py archive.mbox hanzer > hanzer.mbox
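For anyone on Python 3, a rough equivalent might look like the sketch below. The function name matching_messages is just something picked for illustration, and this version has only been thought through against small ASCII test mailboxes, not a full Takeout archive.

```python
import mailbox


def matching_messages(mbox_path, name):
    """Yield messages whose From or To headers contain name (case-insensitive).

    Like the script above, messages missing either header are skipped.
    """
    needle = name.lower()
    for message in mailbox.mbox(mbox_path):
        if "From" not in message or "To" not in message:
            continue
        addrs = message.get_all("From") + message.get_all("To")
        # str() covers header values that come back as Header objects
        # rather than plain strings.
        if any(needle in str(addr).lower() for addr in addrs):
            yield message
```

Looping over matching_messages("archive.mbox", "hanzer") and printing message.as_string(unixfrom=True) for each hit should give output along the same lines as the usage above.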