Another 2 to 3 mail encoding problem
Chris Green
cl at isbd.net
Thu Aug 27 04:31:44 EDT 2020
Terry Reedy <tjreedy at udel.edu> wrote:
> On 8/26/2020 11:10 AM, Chris Green wrote:
>
> > I have a simple[ish] local mbox mail delivery module as follows:-
> ...
> > It has run faultlessly for many years under Python 2. I've now
> > changed the calling program to Python 3 and while it handles most
> > E-Mail OK I have just got the following error:-
> >
> > Traceback (most recent call last):
> > File "/home/chris/.mutt/bin/filter.py", line 102, in <module>
> > mailLib.deliverMboxMsg(dest, msg, log)
> ...
> > File "/usr/lib/python3.8/email/generator.py", line 406, in write
> > self._fp.write(s.encode('ascii', 'surrogateescape'))
> > UnicodeEncodeError: 'ascii' codec can't encode character '\ufeff' in
> > position 4: ordinal not in range(128)
>
> '\ufeff' is the Unicode byte-order mark. It should not be present in an
> ascii-only 3.x string and would not normally be present in general
> unicode except in messages like this that talk about it. Read about it,
> for instance, at
> https://en.wikipedia.org/wiki/Byte_order_mark
>
> I would catch the error and print part or all of string s to see what is
> going on with this particular message. Does it have other non-ascii chars?
>
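A minimal sketch of the check Terry suggests, assuming the failing text is
available as a str. The helper name describe_encode_error is hypothetical
(it is not part of mailLib or the email package); it relies only on the
start, end and object attributes that a UnicodeEncodeError carries:

```python
def describe_encode_error(text, context=10):
    """Return a description of the first character that cannot be written
    as ASCII, or None if the whole string encodes cleanly."""
    try:
        # Same encode call that fails in email/generator.py in the traceback.
        text.encode("ascii", "surrogateescape")
    except UnicodeEncodeError as err:
        lo = max(err.start - context, 0)
        hi = err.end + context
        return "bad character {!a} at position {}, context {!a}".format(
            err.object[err.start:err.end], err.start, err.object[lo:hi])
    return None


# Hypothetical sample: a byte-order mark after four ASCII characters.
report = describe_encode_error("abcd\ufeffefgh")
print(report)
```

Wrapping the delivery call in a similar try/except and logging the report
should show whether the message has other non-ASCII characters around the
reported position.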
I can provoke the error simply by sending myself an E-Mail with
accented characters in it. I'm pretty sure my Linux system is set up
correctly for UTF-8 characters: I can certainly send these to and
receive them from others, and I even get to see messages in other
scripts such as Arabic, Chinese, etc.

The code above works perfectly in Python 2, delivering messages with
accented (and other extended) characters with no problems at all,
including E-Mails that I send to myself.

While an E-Mail body possibly *shouldn't* have non-ASCII characters in
it, one must be able to handle them without errors. In fact, haven't
the RFCs changed such that the message body should be 8-bit clean?

Anyway, I think the Python 3 mail handling libraries need to be able to
pass extended characters through without errors.
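
For comparison, here is a rough sketch of how the standard email package
treats 8-bit content depending on how the message was parsed. It is only
an assumption that this matches what my module does, since the module code
is not shown here. The sample message, the addresses and the flatten
helper are all made up for illustration:

```python
import email
import io
from email.generator import BytesGenerator

# Hypothetical raw message as it might arrive on stdin, with UTF-8 bytes
# in an 8-bit body.
RAW = (
    b"From: sender@example.com\n"
    b"To: chris@example.com\n"
    b"Subject: accent test\n"
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=utf-8\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + "caf\u00e9 cr\u00e8me\n".encode("utf-8")


def flatten(msg):
    """Serialise a message to bytes with BytesGenerator."""
    out = io.BytesIO()
    BytesGenerator(out, mangle_from_=False).flatten(msg)
    return out.getvalue()


# Parsed from bytes: non-ASCII bytes are kept as surrogate escapes, which
# the generator turns back into the original bytes on output.
from_bytes = flatten(email.message_from_bytes(RAW))

# Parsed from str: the body holds real non-ASCII characters, which the
# generator's ASCII encode step may reject.
try:
    flatten(email.message_from_string(RAW.decode("utf-8")))
    str_error = None
except UnicodeEncodeError as err:
    str_error = err

print(from_bytes == RAW)
print(str_error)
```

If that is what is going on, reading the incoming message with
email.message_from_binary_file(sys.stdin.buffer) rather than from decoded
text might be enough to get extended characters through unchanged.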
--
Chris Green