Parsing email attachments: get_payload() produces unsaveable data

dpapathanasiou denis.papathanasiou at gmail.com
Sun Oct 4 10:27:36 EDT 2009


I'm using Python to access an email account via POP and, for each
incoming message, save any attachments.

This is the function which scans the message for attachments:

import email

def save_attachments (local_folder, msg_text):
    """Scan the email message text and save the attachments (if any)
    in the local_folder"""
    if msg_text:
        for part in email.message_from_string(msg_text).walk():
            if part.is_multipart() or part.get_content_maintype() == 'text':
                continue
            filename = part.get_filename(None)
            if filename:
                filedata = part.get_payload(decode=True)
                if filedata:
                    write_file(local_folder, filename, filedata)

All the way up to write_file(), it's working correctly.

The filename variable matches the name of the attached file, and the
filedata variable contains binary data corresponding to the file's
contents.
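
For reference, here is a minimal self-contained sketch of the same scan written for Python 3 (the code in this post is Python 2). The helper name iter_attachments and the sample message are made up for illustration. It suggests that get_payload(decode=True) hands back the decoded attachment as a plain byte string, not as a file-like object.

```python
import email
from email.message import EmailMessage


def iter_attachments(msg_text):
    """Yield (filename, payload) for each named, non-text message part."""
    for part in email.message_from_string(msg_text).walk():
        if part.is_multipart() or part.get_content_maintype() == 'text':
            continue
        filename = part.get_filename(None)
        if filename:
            # decode=True undoes the transfer encoding (base64 here)
            filedata = part.get_payload(decode=True)
            if filedata:
                yield filename, filedata


# Build a small message with one binary attachment (illustrative data only).
original = b'\x00\x01binary\r\ndata'
msg = EmailMessage()
msg['Subject'] = 'demo'
msg.set_content('body text')
msg.add_attachment(original, maintype='application',
                   subtype='octet-stream', filename='sample.bin')

found = list(iter_attachments(msg.as_string()))
```

In this sketch, found holds a single (filename, payload) pair whose payload is a bytes object equal to the original attachment data.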

When I try to write the filedata to a file system folder, though, I
get an AttributeError in the stack trace.

Here is my write_file() function:

def write_file (folder, filename, f, chunk_size=4096):
    """Write the file data f to the folder and filename
    combination"""
    result = False
    if confirm_folder(folder):
        try:
            file_obj = open(os.path.join(folder, file_base_name(filename)),
                            'wb', chunk_size)
            for file_chunk in read_buffer(f, chunk_size):
                file_obj.write(file_chunk)
            file_obj.close()
            result = True
        except (IOError):
            print "file_utils.write_file: could not write '%s' to '%s'" % (
                file_base_name(filename), folder)
    return result
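
read_buffer() is not shown in this post, so the following is only a guess at where the AttributeError could come from. If read_buffer() calls .read() on its argument, it needs a file-like object, while the decoded payload is a byte string. The Python 3 sketch below uses a hypothetical stand-in, read_chunks(), to reproduce that failure and then wraps the payload in io.BytesIO before writing it out. The names read_chunks and write_payload are assumptions for illustration, not the original helpers.

```python
import io
import os
import tempfile


def read_chunks(stream, chunk_size=4096):
    """Hypothetical stand-in for read_buffer(): yield chunks from a file-like object."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def write_payload(folder, filename, data, chunk_size=4096):
    """Write a byte-string payload to folder, using only the base name of filename."""
    path = os.path.join(folder, os.path.basename(filename))
    with open(path, 'wb') as file_obj:
        # Wrap the bytes so the chunked reader has a .read() method to call.
        for chunk in read_chunks(io.BytesIO(data), chunk_size):
            file_obj.write(chunk)
    return path


payload = b'\x89PNG\r\n' + b'\x00' * 5000

# Passing the raw byte string fails: bytes objects have no .read() method.
try:
    next(read_chunks(payload))
    raised = False
except AttributeError:
    raised = True

# Wrapping the payload first lets the same chunked writer succeed.
with tempfile.TemporaryDirectory() as folder:
    saved = write_payload(folder, 'mail/sample.bin', payload)
    with open(saved, 'rb') as fh:
        round_trip = fh.read()
```

If the payload is small enough to hold in memory, a single file_obj.write(data) call would also do, with no chunking at all.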

I also tried applying this regex:

filedata = re.sub(r'\r(?!=\n)', '\r\n', filedata)  # Bare \r becomes \r\n

after reading this post:

http://stackoverflow.com/questions/787739/python-email-getpayload-decode-fails-when-hitting-equal-sign

but it hasn't resolved the problem.

Is there any way of correcting the output of get_payload() so I can
save it to a file?


