Decoding an attachment

Terry Reedy tjreedy at udel.edu
Thu Jul 31 23:42:00 EDT 2008



robinsiebler at gmail.com wrote:
> I figured out how to save an e-mail message as a text file, but I'm
> not sure how to decode the encoded part as I am not sure how much I
> need to include to decode it properly.  Here is what a message looks
> like:
The email.parser and email.message modules will do this for you.
But understanding the parts will help you use them properly.
I am 99% sure of the following comments.
> 
> 
> Received: from INGESTOR2SQA ([10.220.83.198]) by sqaserver300 with
> Microsoft SMTPSVC(6.0.3790.0);
> 	 Thu, 31 Jul 2008 12:10:26 -0700
> mime-version: 1.0
> from: "AVDN Ingestor" <sqatest at ictvsys.pvt>
> to: sqatest at ictvsys.pvt
> date: 31 Jul 2008 17:53:40 +0000
> subject: Upload Status for test
> content-type: multipart/mixed;
This says multiple mime parts of mixed types

>   boundary=--boundary_2_09ab8836-ff06-41a6-94d6-59258539bf88
This is the boundary between parts
> Return-Path: sqatest at ictvsys.pvt
> Message-ID: <SQASERVER300wQtcC6Z0000000e at sqaserver300>
> X-OriginalArrivalTime: 31 Jul 2008 19:10:26.0125 (UTC)
> FILETIME=[16D08FD0:01C8F341]
> 
> 
blank lines separate header from body
> ----boundary_2_09ab8836-ff06-41a6-94d6-59258539bf88
This is the boundary specified above, with '--' prepended
> content-type: text/plain; charset=us-ascii
> content-transfer-encoding: quoted-printable
This is header of first part
> 
Blank line separates part header from part body
> For additional information, please see attachment.
This is the first (and last) line of the first, plain-text part
> ----boundary_2_09ab8836-ff06-41a6-94d6-59258539bf88
This is boundary ending first part and starting second part
> content-type: text/xml; name=status.xml; charset=utf-8
> content-transfer-encoding: base64
> content-disposition: attachment
This is the 2nd part header, specifying base64 coding, which is one of the
transfer encodings that email.message can decode (quoted-printable is another).
> 
Blank separating header and payload
> PGF2ZG5fcG5tIHZlcnNpb249IjEuMSIgc2VuZGVyPSJhdmRud2VzdGRjMSIgZG9jbnVtYmVy
> PSJvaHV0bHBxeXptM25xd3lrYm9wd3pwdTUiIHN0YXR1cz0ic3VjY2Vzc2Z1bCI+PGNvbnRh
> Y3QgcGhvbmU9IjQwOC05MzEtOTIzMiIgZW1haWw9Im0uY2hhbkBhdm5ldHdvcmtzLmNvbSIg
> Lz48YXNzZXRzdGF0ZSBwcm92aWRlcmlkPSJTUUEiIGFzc2V0aWQ9ImFjZjFlM2Q2ZWJjZjcw
> NDcwNWFjIiBmaWxlbmFtZT0iXHZpZGVvXGZvb1xib3VnZTAwMy4zMjBfMjQwX250c2NfYWMz
> XzAxLmpwZyIgbm90aWZpY2F0aW9uPSIxMjAiIC8+PGFzc2V0c3RhdGUgcHJvdmlkZXJpZD0i
> U1FBIiBhc3NldGlkPSJhY2YxZTNkNmViY2Y3MDQ3MDVhYyIgZmlsZW5hbWU9Ilx2aWRlb1xm
> b29cYm91Z2UwMDMuMzIwXzI0MF9udHNjX2FjM18wMy5qcGciIG5vdGlmaWNhdGlvbj0iMTIw
> IiAvPjxhc3NldHN0YXRlIHByb3ZpZGVyaWQ9IlNRQSIgYXNzZXRpZD0iYWNmMWUzZDZlYmNm
> NzA0NzA1YWMiIGZpbGVuYW1lPSJcdmlkZW9cZm9vXGJvdWdlMDAzLjMyMF8yNDBfbnRzY19h
> YzNfMDQuanBnIiBub3RpZmljYXRpb249IjEyMCIgLz48YXNzZXRzdGF0ZSBwcm92aWRlcmlk
> PSJTUUEiIGFzc2V0aWQ9ImFjZjFlM2Q2ZWJjZjcwNDcwNWFjIiBmaWxlbmFtZT0iXHZpZGVv
> XGZvb1xib3VnZTAwMy4zMjBfMjQwX250c2NfYWMzXzAyLmpwZyIgbm90aWZpY2F0aW9uPSIx
> MjAiIC8+PGFzc2V0c3RhdGUgcHJvdmlkZXJpZD0iU1FBIiBhc3NldGlkPSJhY2YxZTNkNmVi
> Y2Y3MDQ3MDVhYyIgZmlsZW5hbWU9Ilx2aWRlb1xmb29cYm91Z2UwMDMuMzIwXzI0MF9udHNj
> X2FjMy5qcGciIG5vdGlmaWNhdGlvbj0iMTIwIiAvPjxhc3NldHN0YXRlIHByb3ZpZGVyaWQ9
> IlNRQSIgYXNzZXRpZD0iYWNmMWUzZDZlYmNmNzA0NzA1YWMiIGZpbGVuYW1lPSJcdmlkZW9c
> Zm9vXGJvdWdlMDAzLjMyMF8yNDBfbnRzY19hYzNfMDguanBnIiBub3RpZmljYXRpb249IjEy
> MCIgLz48YXNzZXRzdGF0ZSBwcm92aWRlcmlkPSJTUUEiIGFzc2V0aWQ9ImFjZjFlM2Q2ZWJj
> ZjcwNDcwNWFjIiBmaWxlbmFtZT0iXHZpZGVvXGZvb1xib3VnZTAwMy4zMjBfMjQwX250c2Nf
> YWMzXzA3LmpwZyIgbm90aWZpY2F0aW9uPSIxMjAiIC8+PGFzc2V0c3RhdGUgcHJvdmlkZXJp
> ZD0iU1FBIiBhc3NldGlkPSJhY2YxZTNkNmViY2Y3MDQ3MDVhYyIgZmlsZW5hbWU9Ilx2aWRl
> b1xmb29cYm91Z2UwMDMuMzIwXzI0MF9udHNjX2FjM18wNi5qcGciIG5vdGlmaWNhdGlvbj0i
> MTIwIiAvPjxhc3NldHN0YXRlIHByb3ZpZGVyaWQ9IlNRQSIgYXNzZXRpZD0iYWNmMWUzZDZl
> YmNmNzA0NzA1YWMiIGZpbGVuYW1lPSJcdmlkZW9cZm9vXGJvdWdlMDAzLjMyMF8yNDBfbnRz
> Y19hYzNfMDUuanBnIiBub3RpZmljYXRpb249IjEyMCIgLz48YXNzZXRzdGF0ZSBwcm92aWRl
> cmlkPSJTUUEiIGFzc2V0aWQ9ImFjZjFlM2Q2ZWJjZjcwNDcwNWFjIiBmaWxlbmFtZT0iXHZp
> ZGVvXGZvb1xib3VnZTAwMy4zMjBfMjQwX250c2NfYWMzXzA5LmpwZyIgbm90aWZpY2F0aW9u
> PSIxMjAiIC8+PGFzc2V0c3RhdGUgcHJvdmlkZXJpZD0iU1FBIiBhc3NldGlkPSJhY2YxZTNk
> NmViY2Y3MDQ3MDVhYyIgZmlsZW5hbWU9Ilx2aWRlb1xmb29cYm91Z2UwMDMuMzIwXzI0MF9u
> dHNjX2FjMy50cyIgbm90aWZpY2F0aW9uPSIxMjAiIC8+PC9hdmRuX3BubT4=
end of payload, lines in between go to base64 module
> ----boundary_2_09ab8836-ff06-41a6-94d6-59258539bf88--
End of part 2 and mime body
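If you want to see what "go to base64 module" means by hand, here is a
minimal sketch. It uses only the first payload line quoted above; the helper
name decode_payload is my own invention for illustration, not part of any
library. A full 72-character line decodes on its own because 72 is a
multiple of 4.

```python
import base64

# First payload line of the attachment, copied from the quoted message
# (with the leading "> " quote marker removed).
first_line = (
    "PGF2ZG5fcG5tIHZlcnNpb249IjEuMSIgc2VuZGVyPSJhdmRud2VzdGRjMSIgZG9jbnVtYmVy"
)


def decode_payload(lines):
    """Join base64 payload lines (hypothetical helper) and decode to bytes."""
    return base64.b64decode("".join(line.strip() for line in lines))


# The part header declares charset=utf-8, so decode the bytes as UTF-8.
xml_start = decode_payload([first_line]).decode("utf-8")
print(xml_start)
```

To decode the whole attachment this way, pass every payload line between the
blank line and the closing boundary, not the part header or the boundary
lines themselves.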
# I believe the following works:
import email

s = message_as_string  # the complete raw message text, headers included
m = email.message_from_string(s)  # email.message.Message object
p = m.get_payload(1)  # the 2nd part, itself a Message object
# decode=True cannot be combined with the index above: on a multipart
# message it returns None, so decode on the part itself.
q = p.get_payload(decode=True)  # base64-decoded payload: utf-8 encoded xml
# I recommend trying out other Message methods.
# In particular, you want to be able to retrieve the subfields of
# content_type, and I don't know exactly how.
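
For the content-type subfields, the Message accessor methods
get_content_type(), get_param(), get_content_charset() and get_filename()
look like the ones to try. The sketch below runs them on a cut-down stand-in
for the message above. The boundary "--boundary_2_demo" and the one-line
payload (base64 for "<status/>") are made up for illustration, so treat this
as a sketch rather than a tested recipe for the real message.

```python
import email

# Shortened stand-in for the quoted message. The boundary value and the
# base64 payload are hypothetical; the header layout follows the original.
raw = (
    "mime-version: 1.0\n"
    "subject: Upload Status for test\n"
    "content-type: multipart/mixed; boundary=--boundary_2_demo\n"
    "\n"
    "----boundary_2_demo\n"
    "content-type: text/plain; charset=us-ascii\n"
    "\n"
    "For additional information, please see attachment.\n"
    "----boundary_2_demo\n"
    "content-type: text/xml; name=status.xml; charset=utf-8\n"
    "content-transfer-encoding: base64\n"
    "content-disposition: attachment\n"
    "\n"
    "PHN0YXR1cy8+\n"
    "----boundary_2_demo--\n"
)

msg = email.message_from_string(raw)
boundary = msg.get_boundary()          # boundary parameter, without the extra '--'

part = msg.get_payload(1)              # 2nd part, a Message object
content_type = part.get_content_type() # main type and subtype only
name = part.get_param("name")          # a named content-type parameter
charset = part.get_content_charset()   # charset parameter, lowercased
filename = part.get_filename()         # falls back to the content-type name

# decode=True undoes the base64 transfer encoding and returns bytes.
xml = part.get_payload(decode=True).decode(charset)

print(boundary, content_type, name, charset, filename, xml)
```

Note that get_filename() looks first for a filename parameter on
content-disposition and only then for name on content-type, which is the
case that applies here.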

Terry Jan Reedy



