Distinguishing between maildir, mbox, and MH files/directories?
Cameron Simpson
cs at zip.com.au
Sun Aug 31 22:07:51 EDT 2014
On 31Aug2014 13:45, Tim Chase <python.list at tim.thechases.com> wrote:
>Tinkering around with a little script, I found myself with the need
>to walk a directory tree and process mail messages found within.
>Sometimes these end up being mbox files (with multiple messages
>within), sometimes it's a Maildir structure with messages in each
>individual file and extra holding directories, and sometimes it's a
>MH directory. To complicate matters, there's also the possibility of
>non-{mbox,maildir,mh} files such as binary MUA caches appearing
>alongside these messages.
>
>Python knows how to handle each just fine as long as I tell it what
>type of file to expect. But is there a straight-forward way to
>distinguish them? (FWIW, the *nix "file" utility is just reporting
>"ASCII text", sometimes "with very long lines", and sometimes
>erroneously flags them as C or C++ files‽).
>
>All I need is "is it maildir, mbox, mh, or something else" (I don't
>have to get more complex for the "something else") inside an os.walk
>loop.
Here is my code for these tests:

    import os

    def ismhdir(path):
        ''' Test if `path` points at an MH directory.
        '''
        return os.path.isfile(os.path.join(path, '.mh_sequences'))

    def ismaildir(path):
        ''' Test if `path` points at a Maildir directory.
        '''
        for subdir in ('new', 'cur', 'tmp'):
            if not os.path.isdir(os.path.join(path, subdir)):
                return False
        return True

    def ismbox(path):
        ''' Open path and check that its first line begins with "From ".
        '''
        fp = None
        try:
            fp = open(path)
            from_ = fp.read(5)
        except IOError:
            # Covers both a failed open() and a failed read().
            if fp is not None:
                fp.close()
            return False
        fp.close()
        return from_ == 'From '

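
As an aside, here is a possible variant of ismbox for Python 3. It is only a sketch: the name ismbox_bytes is made up for illustration. It opens the file in binary mode, on the assumption that the binary MUA caches mentioned above could otherwise trip a decoding error in text mode, and it uses a with block so the file is closed on every path.

```python
def ismbox_bytes(path):
    """Return True if the file at `path` starts with the bytes b'From '.

    Hypothetical helper: a binary-mode variant of ismbox().
    """
    try:
        # Binary mode: no text decoding, so arbitrary file content is safe.
        with open(path, 'rb') as fp:
            return fp.read(5) == b'From '
    except OSError:
        # Missing file, unreadable file, or `path` is a directory.
        return False
```

Like the original, this only looks at the first five bytes, so it is a cheap heuristic rather than proof that the file is a well-formed mbox.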
I would use these in code somewhat like this (imagining your use case):

    if ismaildir(path):
        ...
    elif ismhdir(path):
        ...
    elif ismbox(path):
        ...
    else:
        # reject other known special files here
        # continue traversing downward otherwise
        ...

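
To make that concrete, here is a rough, self-contained sketch of how such tests might slot into an os.walk loop. The names classify and find_mailboxes are invented for illustration, Python 3 is assumed, and the sketch assumes that a Maildir's new/cur/tmp subdirectories and the files directly inside a Maildir or MH folder need no further inspection.

```python
import os

def classify(path):
    """Return 'maildir', 'mh', 'mbox', or None for `path` (hypothetical helper)."""
    if os.path.isdir(path):
        if all(os.path.isdir(os.path.join(path, d)) for d in ('new', 'cur', 'tmp')):
            return 'maildir'
        if os.path.isfile(os.path.join(path, '.mh_sequences')):
            return 'mh'
        return None
    try:
        # Binary mode so non-text files cannot raise a decoding error.
        with open(path, 'rb') as fp:
            return 'mbox' if fp.read(5) == b'From ' else None
    except OSError:
        return None

def find_mailboxes(top):
    """Yield (path, kind) for each mailbox found under `top`."""
    for dirpath, dirnames, filenames in os.walk(top):
        kind = classify(dirpath)
        if kind == 'maildir':
            yield dirpath, kind
            # Do not descend into the message directories themselves.
            dirnames[:] = [d for d in dirnames if d not in ('new', 'cur', 'tmp')]
            continue
        if kind == 'mh':
            yield dirpath, kind
            # Files here are individual MH messages; subfolders are still walked.
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if classify(path) == 'mbox':
                yield path, 'mbox'
```

A caller could then dispatch on the kind, for example opening each path with mailbox.Maildir, mailbox.MH, or mailbox.mbox from the standard library.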
Cheers,
Cameron Simpson <cs at zip.com.au>
Gabriel Genellina: See PEP 234 http://www.python.org/dev/peps/pep-0234/
Angus Rodgers:
You've got to love a language whose documentation contains sentences
beginning like this:
"Among its chief virtues are the following four -- no, five -- no,
six -- points: [...]"
from python-list at python.org