Finding messages in huge mboxes
haasje at welmers.net
Mon Feb 2 21:37:05 CET 2004
I wonder if anyone has run into this same mbox issue.
I have the following problem:
I need to find messages in huge mbox files (50MB or more).
The following approach is (of course?) not very usable:
    import mailbox
    fp = open("mbox", "r")
    archive = mailbox.UnixMailbox(fp)
    i = 0
    while i < message_number_needed:
        needed_message = archive.next()  # parses every message on the way
        i += 1
This is especially painful because I often need messages at the end
of the mbox file, which means reading through nearly the whole file.
So I tried the following (scanning for "From " lines with readline()
and counting back from the total number of messages):
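    j = 0
    while True:
        pos = fp.tell()              # offset of the line about to be read
        line = fp.readline()
        if not line:
            break                    # end of file, message not found
        if line[:5] == 'From ':
            if j == total_messages - message_number_needed:
                archive.seekp = pos  # let next() start at this "From " line
                message = archive.next()
                break                # message found
            j += 1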
But this also seems to be slow and CPU-intensive.
Does anyone have a better idea?
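
For comparison, here is a minimal sketch of what a real backwards scan could look like: it reads the file in fixed-size blocks starting from the end and looks for "From " at the start of a line, so only the tail of the file is touched when the wanted message is near the end. The function name, the block size, and the "1 = last message" numbering are assumptions made for this illustration. Like the readline() test above, it treats every line starting with "From " as the start of a message and does not validate the rest of the line.

```python
import os

SEP = b"\nFrom "  # a "From " that begins a line (any line but the first)

def find_from_offset_from_end(path, n_from_end, block_size=64 * 1024):
    """Return the byte offset of the n-th "From " line counted from the
    end of the file (1 = last message), or None if there are fewer."""
    with open(path, "rb") as fp:
        fp.seek(0, os.SEEK_END)
        pos = fp.tell()
        tail = b""   # first bytes of the region already scanned
        found = 0
        while pos > 0:
            start = max(0, pos - block_size)
            fp.seek(start)
            # Append a short overlap so a separator that straddles the
            # block boundary is still seen. The overlap is shorter than
            # SEP, so no separator can be counted twice.
            data = fp.read(pos - start) + tail
            end = len(data)
            while True:
                i = data.rfind(SEP, 0, end)
                if i < 0:
                    break
                found += 1
                if found == n_from_end:
                    return start + i + 1   # skip the leading newline
                end = i
            # The very first line of the file has no newline before it.
            if start == 0 and data.startswith(b"From "):
                found += 1
                if found == n_from_end:
                    return 0
            tail = data[:len(SEP) - 1]
            pos = start
    return None
```

The returned offset could then be used the same way as in the code above, that is, by seeking to it (or assigning it to archive.seekp) before reading the message. Whether this is actually faster on a given file is something I have not measured.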