Reading Outlook .msg file using Python

John Henry john106henry at hotmail.com
Wed Oct 20 13:13:16 EDT 2010


On Oct 20, 9:01 am, John Henry <john106he... at hotmail.com> wrote:
> On Oct 20, 1:41 am, Tim Golden <m... at timgolden.me.uk> wrote:
>
>
>
> > On 19/10/2010 22:48, John Henry wrote:
>
> > > Looks like this flag is valid only if you are getting messages
> > > directly from Outlook.  When reading the msg file, the flag is
> > > invalid.
>
> > > Same issue when accessing attachments.  In addition, the MAPITable
> > > method does not seem to work at all when trying to get attachments out
> > > of the msg file (works when dealing with message in an Outlook
> > > mailbox).  Either way, the display_name doesn't work when trying to
> > > display the filename of the attachment.
>
> > > I was able to get the date by using the PR_TRANSPORT_MESSAGE_HEADERS
> > > mapitags
>
> > Ah, thanks. As you will have realised, my code is basically geared
> > to reading an Outlook/Exchange message box. I hadn't really tried
> > it on individual message files, except my original excerpt. If it
> > were opportune, I'd be interested in seeing your working code.
>
> > TJG
>
> When (and if) I finally figure out how to get it done, I surely will
> make the code available.  It's pretty close.  All I need is to figure
> out how to extract the attachments.
>
> Too bad I don't know (and don't have) C#.  This guy did it so cleanly:
>
> http://www.codeproject.com/KB/office/reading_an_outlook_msg.aspx?msg=...
>
> Maybe somebody who knows both C# and Python can convert the code
> (not much code) and then the Python community will have it.  As it
> stands, it seems the solution is available in Java, C#, VB .... but
> not Python.

BTW: For the benefit of future searches on this topic: in the code
listed above, where:

storage_flags = STGM_DIRECT | STGM_READ | STGM_SHARE_EXCLUSIVE

I had to change it to:

storage_flags = (STGM_DIRECT | STGM_READ | STGM_SHARE_DENY_NONE |
                 STGM_TRANSACTED)

otherwise I get a sharing violation (see
http://efreedom.com/Question/1-1086814/Opening-OLE-Compound-Documents-Read-StgOpenStorage).

For now, I am using a brute-force method
(http://mail.python.org/pipermail/python-win32/2009-February/008825.html)
to get the names of the attachments.  If I need to extract the
attachments, I pop up the message in Outlook and let Outlook extract
the files.  Ugly, but it fits my client's needs for now.  Hopefully
there will be a cleaner solution down the road.

Here's my code for brute-forcing the attachment names out of the msg
file (very ugly):

	def get_attachments(self, fileID):
		# Return the attachment names found in the headers of a .msg file.
		from win32com import storagecon
		import pythoncom

		# A .msg file is an OLE compound document; open it as a storage.
		flags = (storagecon.STGM_READ | storagecon.STGM_SHARE_DENY_NONE |
			storagecon.STGM_TRANSACTED)
		try:
			storage = pythoncom.StgOpenStorage (fileID, None, flags)
		except pythoncom.com_error:
			return []

		# Streams inside the storage are opened with exclusive sharing.
		flags = storagecon.STGM_READ | storagecon.STGM_SHARE_EXCLUSIVE
		attachments = []
		for data in storage.EnumElements ():
			# data[0] = element name, data[1] = type (2 = stream), data[2] = size
			print(data[0], data[1])
			if data[1] == 2 or data[0] == "__substg1.0_007D001F":
				stream = storage.OpenStream (data[0], None, flags)
				try:
					msg = stream.Read (data[2])
				except pythoncom.com_error:
					pass
				else:
					# Drop the NUL bytes of the 16-bit text and undo the %23 escape.
					msg = repr (msg).replace("\\x00","").strip("'").replace("%23","#")
					if data[0] == "__substg1.0_007D001F":
						try:
							attachments.append(msg.split("name=\"")[1].split("\"")[0])
						except IndexError:
							pass

		return attachments
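
The string handling in the function above can be tried out without
Outlook or pywin32.  The sketch below is only an illustration, not
part of the code above: it assumes the bytes read from the
__substg1.0_007D001F stream are UTF-16-LE text (the 001F suffix is the
MAPI Unicode string type), and the helper name
attachment_names_from_headers is made up for this example.  Unlike
the split() call above, it returns every name="..." value it finds,
not just the first.  Whether the header stream lists the attachments
at all still depends on the message.

```python
import re

# Matches name="..." and, as a side effect, filename="..." as well,
# just like the split on 'name="' in the function above.
_NAME_RE = re.compile(r'name="([^"]*)"')


def attachment_names_from_headers(raw):
    """Return the distinct name="..." values found in a raw header stream.

    raw is assumed to be the bytes of the __substg1.0_007D001F stream,
    encoded as UTF-16-LE.
    """
    text = raw.decode("utf-16-le", errors="replace")
    names = [name.replace("%23", "#") for name in _NAME_RE.findall(text)]
    # Content-Type and Content-Disposition often repeat the same name;
    # keep the first occurrence of each, in order.
    return list(dict.fromkeys(names))


if __name__ == "__main__":
    sample = (
        'Content-Type: application/pdf; name="report%231.pdf"\r\n'
        'Content-Disposition: attachment; filename="report%231.pdf"\r\n'
    ).encode("utf-16-le")
    print(attachment_names_from_headers(sample))  # ['report#1.pdf']
```

Decoding the bytes first avoids the repr()/replace("\\x00") trick and
keeps non-ASCII characters in the names intact.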



