mailbox misbehavior with non-ASCII
Peter Pearson
pkpearson at nowhere.invalid
Fri Jul 29 19:24:57 EDT 2022
The following code produces a nonsense result with the input
described below:
import mailbox
box = mailbox.Maildir("/home/peter/Temp/temp",create=False)
x = box.values()[0]
h = x.get("X-DSPAM-Factors")
print(type(h))
# <class 'email.header.Header'>
The output is the desired <class 'str'> when the message file contains this:
To: recipient@example.com
Message-ID: <123>
Date: Sun, 24 Jul 2022 15:31:19 +0000
Subject: Blah blah
From: from@from.com
X-DSPAM-Factors: a'b

xxx
... but if the apostrophe in "a'b" is replaced with a
RIGHT SINGLE QUOTATION MARK (U+2019), the returned h is of type
"email.header.Header", and seems to contain inscrutable garbage.
I realize that one should not put non-ASCII characters in
message headers, but of course I didn't put it there; it
just showed up, pretty much beyond my control. And I realize
that when software is given input that breaks the rules, one
cannot expect optimal results, but I'd think an exception
would be the right answer.
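For reference, the type change described above can be reproduced without a Maildir. The following is a minimal sketch, assuming that mailbox parses messages with the email package's legacy compat32 policy; RAW is a hypothetical stand-in for the message file, with U+2019 stored as raw UTF-8 bytes. Under that assumption, parsing the same bytes with email.policy.default appears to return a str instead.

```python
import email
import email.header
import email.policy

# Hypothetical stand-in for the Maildir file: the same kind of headers,
# with the apostrophe replaced by U+2019 encoded as raw UTF-8 bytes.
RAW = (
    b"To: recipient@example.com\n"
    b"Message-ID: <123>\n"
    b"Subject: Blah blah\n"
    b"X-DSPAM-Factors: a\xe2\x80\x99b\n"
    b"\n"
    b"xxx\n"
)

# Legacy (compat32) policy, the default for message_from_bytes:
# a header holding undecodable 8-bit bytes comes back as a Header object.
legacy = email.message_from_bytes(RAW)
h_legacy = legacy.get("X-DSPAM-Factors")
print(type(h_legacy))

# Modern policy: header values are str subclasses, and bytes that happen
# to be valid UTF-8 are decoded.
modern = email.message_from_bytes(RAW, policy=email.policy.default)
h_modern = modern.get("X-DSPAM-Factors")
print(type(h_modern), isinstance(h_modern, str))
```

This only illustrates where the Header object seems to come from; whether mailbox ought to raise an exception here is a separate question.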
Is this worth a bug report?
--
To email me, substitute nowhere->runbox, invalid->com.