Problem with accented characters in mailbox.Maildir()
jak
nospam at please.ty
Sat May 6 10:27:04 EDT 2023
Chris Green wrote:
> Chris Green <cl at isbd.net> wrote:
>> A bit more information, msg.get("subject", "unknown") does return a
>> string, as follows:-
>>
>> Subject: =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
>>
>> So it's the 'searchTxt in msg.get("subject", "unknown")' that's
>> failing. I.e. for some reason 'in' isn't working when the searched
>> string has utf-8 characters.
>>
>> Surely there's a way to handle this.
>>
> ... and of course I now see the issue! The Subject: with utf-8
> characters in it gets spaces changed to underscores. So searching for
> '(Waterways Continental Europe)' fails.
>
> I'll either need to test for both versions of the string or I'll need
> to change underscores to spaces in the Subject: returned by msg.get().
> It's a long enough string that I'm searching for that I won't get any
> false positives.
>
>
> Sorry for the noise everyone, it's a typical case of explaining the
> problem shows one how to fix it! :-)
>
This is probably what you need:
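    import email.header

    raw_subj = '=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?='
    # decode_header() returns a list of (bytes, charset) pairs;
    # this header consists of a single encoded word, so take the first pair.
    subj = email.header.decode_header(raw_subj)[0]
    print(subj[0].decode(subj[1]))
    # prints: aka Marne à la Saône (Waterways Continental Europe)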
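
If a Subject: mixes plain text with one or more encoded words, taking only the first pair would drop the rest. A minimal sketch that joins all the pieces with email.header.make_header() and then runs the substring test on the decoded text might look like this. The helper name decoded_subject and the variable search_txt are only illustrative, they are not from your script:

```python
from email.header import decode_header, make_header


def decoded_subject(raw_subject):
    """Return an RFC 2047 encoded header value as a plain str."""
    # make_header() combines every (bytes, charset) chunk produced by
    # decode_header(), so multi-part headers are handled as well.
    return str(make_header(decode_header(raw_subject)))


raw_subj = '=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?='
search_txt = '(Waterways Continental Europe)'  # illustrative search string

subject = decoded_subject(raw_subj)
print(subject)
print(search_txt in subject)
```

In the Maildir loop the same helper could be applied to the value returned by msg.get("subject", "unknown") before the 'in' test, which should avoid having to replace underscores by hand.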