Parsing an mbox mail file

Sheila King sheila at spamcop.net
Fri Jan 26 23:25:31 EST 2001


OK, I'm using the mailbox module (see docs here:
http://www.python.org/doc/current/lib/mailbox-objects.html )
to handle a file that has several messages saved in mbox format.

The following script runs:

--------------------------------------------------
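import mailbox

# Open the mbox file and wrap it in a mailbox reader.
infile = open("spam2.txt", "r")
messages = mailbox.UnixMailbox(infile)

# next() returns None once there are no more messages.
while 1:
    currentmssg = messages.next()
    if currentmssg is None:
        break
    print currentmssg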
--------------------------------------------------

where "spam2.txt" is my mail message file. However, it prints only the
message headers, which matches how I understand the rfc822 module to work.
I've already written a few different scripts that use the rfc822 module, and
it seems to handle only the headers, not the message body.

In other scripts, I've retrieved the message body in this manner:

--------------------------------------------------
#! /usr/bin/python

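import rfc822

raw_message = open("message.txt", "r")
# Parsing the headers consumes the file up to the first blank line.
inheaders = rfc822.Message(raw_message)
# Whatever is left in the file is the message body.
body = raw_message.read()
print inheaders
print
print body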

--------------------------------------------------

However, that file contained only one message. Basically, parsing the
message headers reads up to the first blank line, which leaves the file
pointer positioned at the beginning of the message body.
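
To illustrate the file-pointer behaviour described above, here is a minimal sketch that does not use rfc822 at all. It assumes a newer Python (print() as a function, io.StringIO), and the sample message and the helper name read_headers are made up for illustration. It also ignores folded (continuation) header lines.

```python
import io

# Hypothetical single message: two header lines, a blank line, then the body.
raw_message = io.StringIO(
    "From: someone@example.com\n"
    "Subject: test\n"
    "\n"
    "First body line.\n"
    "Second body line.\n"
)


def read_headers(fp):
    """Consume lines up to and including the first blank line."""
    headers = []
    while True:
        line = fp.readline()
        # Stop at end of file or at the blank line that ends the headers.
        if line in ("", "\n", "\r\n"):
            break
        headers.append(line.rstrip("\r\n"))
    return headers


headers = read_headers(raw_message)
# The file pointer now sits at the start of the body, so read() returns
# everything after the blank separator line.
body = raw_message.read()

print(headers)
print(body)
```

With only one message in the file, read() stops at the end of the file. With several messages, the same read() would run on through every message that follows, which is why this trick does not carry over to a multi-message mbox file.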

If I use the mailbox module and call mailboxInstance.next(), it skips
right over the message body to the next message's headers. The whole reason I
want to use the mailbox module is so that I can easily get to the next
message in the file and get its headers. So I definitely want to use the
next() method. How can I read the message body in between calls to next()?
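
To make the goal concrete, the sketch below shows the kind of per-message access I'm after: each message's headers and its body, one message at a time. It is not written against UnixMailbox. It assumes a newer Python in which the mailbox module provides an mbox class whose messages have a get_payload() method, and the sample messages, the temporary file, and the helper name headers_and_bodies are all made up for illustration.

```python
import mailbox
import os
import tempfile

# Made-up sample data: two messages in mbox format, each introduced by a
# "From " separator line.
SAMPLE = (
    "From alice@example.com Fri Jan 26 23:25:31 2001\n"
    "From: alice@example.com\n"
    "Subject: first\n"
    "\n"
    "Body of the first message.\n"
    "\n"
    "From bob@example.com Fri Jan 26 23:30:00 2001\n"
    "From: bob@example.com\n"
    "Subject: second\n"
    "\n"
    "Body of the second message.\n"
)


def headers_and_bodies(path):
    """Return a (subject, body) pair for every message in the mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        # Each item carries both the headers and the body of one message.
        return [(msg["Subject"], msg.get_payload()) for msg in box]
    finally:
        box.close()


# Write the sample to a temporary file, since mbox takes a path.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "spam2.txt")
with open(path, "w", newline="\n") as f:
    f.write(SAMPLE)

results = headers_and_bodies(path)
for subject, body in results:
    print(subject)
    print(body)
```

This only handles simple single-part messages like the samples; for a multipart message get_payload() returns a list of parts rather than a string.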

--
Sheila King
http://www.thinkspot.net/sheila/
http://www.k12groups.org/



