Parsing an mbox mail file

Oleg Broytmann phd at phd.pp.ru
Sat Jan 27 07:58:21 EST 2001


On Sat, 27 Jan 2001, Sheila King wrote:
> If I use the mailbox module, and use mailboxInstance.next(), it will skip
> right over the message body to the next message's header. The whole reason I'm
> wanting to use the mailbox module, is so that I can easily get to the next
> message in the file, and get its headers. So, I definitely want to use the
> "next()" command. How can I read the message body in between calls to next?

#! /usr/local/bin/python -O


import sys, os
from mailbox import UnixMailbox

# sys.argv[1] is the mbox file; sys.argv[2] is the command to run on each message
infile = open(sys.argv[1], 'r')
mbox = UnixMailbox(infile)

n = 1
while 1:
   # UnixMailbox consumes the From_ line, but I want to preserve it,
   # so read it first and then rewind
   pos = infile.tell()
   from_ = infile.readline()
   infile.seek(pos)

   msg = mbox.next()
   if msg is None: break

   sys.stdout.write("%sProcessing message N%d" % (chr(13), n))
   sys.stdout.flush()
   n = n + 1

   fp = msg.fp
   fp.seek(0) # to the very beginning

   outfile = open("_tmp", 'w')
   outfile.write(from_)
   outfile.write(fp.read()) # write the entire message (headers and body) at once
   outfile.close()

   os.system("%s _tmp >>error.log 2>&1" % sys.argv[2])

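infile.close()
print # finish the progress line with a newline
if os.path.exists("_tmp"): # nothing was written if the mailbox was empty
   os.remove("_tmp")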
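
For readers on Python 3, where UnixMailbox no longer exists, roughly the same
loop can be sketched with mailbox.mbox. This is only a minimal sketch, not the
script above: the sample mailbox, read_messages() and demo() are made up for
illustration. get_from() returns the From_ line without its leading "From ",
so the prefix is put back by hand, and get_payload() gives the body of a
simple (non-multipart) message.

```python
import mailbox
import os
import tempfile

# Two-message sample mailbox, invented for this sketch.
SAMPLE = (
    "From alice@example.com Sat Jan 27 07:00:00 2001\n"
    "From: alice@example.com\n"
    "Subject: first\n"
    "\n"
    "Body of the first message.\n"
    "\n"
    "From bob@example.com Sat Jan 27 08:00:00 2001\n"
    "From: bob@example.com\n"
    "Subject: second\n"
    "\n"
    "Body of the second message.\n"
)


def read_messages(path):
    """Return (From_ line, subject, body) for each message in the mbox file."""
    results = []
    mbox = mailbox.mbox(path, create=False)
    try:
        for msg in mbox:
            # get_from() omits the leading "From ", so restore it here.
            from_line = "From " + msg.get_from()
            results.append((from_line, msg["Subject"], msg.get_payload()))
    finally:
        mbox.close()
    return results


def demo():
    """Write SAMPLE to a temporary file, parse it, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".mbox")
    try:
        with os.fdopen(fd, "w", newline="\n") as f:
            f.write(SAMPLE)
        return read_messages(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for from_line, subject, body in demo():
        print(from_line)
        print("Subject:", subject)
        print(body)
```

Here both the headers and the body are available on each message object, so
there is no need to track file positions between calls.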

Oleg.
----
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.




