[python-win32] Bug When Reading In PSTs

Tim Roberts timr at probo.com
Thu Feb 18 02:12:14 EST 2021


On Feb 17, 2021, at 6:07 AM, Nick Orr <nick.p.orr.spam at gmail.com> wrote:
> 
> I've been developing a Python tool to ingest and write all emails from a PST exported from Outlook to individual .html files. The issue is that when opening the PST in outlook and checking the source information for emails individually, it includes this specific line:
> 
> <meta http-equiv=Content-Type content="text/html; charset=utf-8">
> 
> which IS NOT being included when importing the PST with Pywin32 and reading all the emails in the PST. To see what it looks like in a chunk - 

What you HAVEN’T said here is how you are talking to Outlook — how you generated your “outlook” object.  PyWin32 doesn’t have any code that is Outlook-specific.  I assume you’re using win32com.client.Dispatch.  If so, remember that Python isn’t doing any processing here.  It’s just passing requests through COM to Outlook.  If your text is coming back oddly, then Outlook is returning it oddly.
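
For concreteness, here is a minimal sketch of the kind of setup assumed above. The function names and the PST path argument are hypothetical. The object-model calls (GetNamespace, AddStore, Stores, FilePath, GetRootFolder, Folders, Items, Class) all belong to Outlook and are simply forwarded over COM, so nothing in this code rewrites the message body.

```python
OL_MAIL = 43  # numeric value of Outlook's olMail constant (OlObjectClass)


def iter_mail_items(folder):
    """Recursively yield mail items from an Outlook folder-like object."""
    for item in folder.Items:
        # PSTs can also hold appointments, contacts, etc.; keep only mail.
        if getattr(item, "Class", None) == OL_MAIL:
            yield item
    for subfolder in folder.Folders:
        yield from iter_mail_items(subfolder)


def open_pst(pst_path):
    """Attach a PST to the current Outlook profile and return its root folder.

    Assumes Outlook is installed and pywin32 is available.
    """
    # Imported lazily so iter_mail_items can be used without pywin32.
    import win32com.client

    outlook = win32com.client.Dispatch("Outlook.Application")
    namespace = outlook.GetNamespace("MAPI")
    namespace.AddStore(pst_path)
    for store in namespace.Stores:
        if store.FilePath and store.FilePath.lower() == pst_path.lower():
            return store.GetRootFolder()
    raise LookupError("PST was added but no matching store was found")
```

With something like this, every item.HTMLBody you read is whatever string Outlook chose to hand back for that item.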

<meta> tags are intended for web servers; it’s possible that Outlook is absorbing the <meta> tag because it isn’t useful once the message has been stored.  Maybe it has copied the charset to a property of the message object instead.  That is, perhaps there’s something in the “item” object that gets tweaked.
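
If that guess is right, one place to look is the item's InternetCodepage property, which Outlook's object model exposes on mail items as a Windows code page number. The sketch below is only an illustration under that assumption: ensure_charset_meta and the small code page table are hypothetical helpers, not part of PyWin32 or Outlook, and whether Outlook really moves the charset there is unconfirmed.

```python
import re

# Deliberately small, incomplete map from Windows code pages to charset names.
CODEPAGE_TO_CHARSET = {
    65001: "utf-8",
    1252: "windows-1252",
    28591: "iso-8859-1",
    20127: "us-ascii",
}

_META_CHARSET_RE = re.compile(r"<meta[^>]+charset\s*=", re.IGNORECASE)
_HEAD_OPEN_RE = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def ensure_charset_meta(html, codepage):
    """Return html with a Content-Type <meta> tag added if none is present.

    codepage is assumed to be the value of the item's InternetCodepage.
    Unknown code pages leave the HTML untouched rather than guessing.
    """
    if _META_CHARSET_RE.search(html):
        return html  # a charset declaration is already there
    charset = CODEPAGE_TO_CHARSET.get(codepage)
    if charset is None:
        return html
    tag = '<meta http-equiv=Content-Type content="text/html; charset=%s">' % charset
    match = _HEAD_OPEN_RE.search(html)
    if match:
        # Insert right after the opening <head> tag.
        return html[:match.end()] + tag + html[match.end():]
    return tag + html  # no <head>; prepend as a fallback
```

Usage would be along the lines of ensure_charset_meta(item.HTMLBody, item.InternetCodepage) before writing each .html file, taking care to encode the file with the same charset the tag declares.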


> Because the emails otherwise are identical, I can only assume this is being done by the library. I'm wondering if there's a reason that meta tag is excluded, or if it's a bug in PyWin32?

No, it’s not being done by PyWin32.  It’s being done by Outlook.  You’d get the same result if you called this method from C#.
-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.
