What do these '=?utf-8?' sequences mean in python?
Chris Green
cl at isbd.net
Mon May 8 04:45:43 EDT 2023
Keith Thompson <Keith.S.Thompson+u at gmail.com> wrote:
> Chris Green <cl at isbd.net> writes:
> > Chris Green <cl at isbd.net> wrote:
> >> I'm having a real hard time trying to do anything to a string (?)
> >> returned by mailbox.MaildirMessage.get().
> >>
> > What a twit I am :-)
> >
> > Strings are immutable, I have to do:-
> >
> > newstring = oldstring.replace("_", " ")
> >
> > Job done!
>
> Not necessarily.
>
> The subject in the original article was:
> =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
>
> That's some kind of MIME encoding. Just replacing underscores by spaces
> won't necessarily give you anything meaningful. (What if there are
> actual underscores in the original subject line?)
>
> You should probably apply some kind of MIME-specific decoding. (I don't
> have a specific suggestion for how to do that.)
>
Yes, OK, but my problem was that my filter looks for the string
"Waterways Continental Europe" in the message Subject: to route the
message to the appropriate mailbox. When the Subject: contains accented
characters the header arrives MIME-encoded, so the string becomes
"Waterways_Continental_Europe" and thus the match fails. Simply
changing all underscores back to spaces makes my test
for "Waterways Continental Europe" work. The changed Subject: line
gets thrown away after the test so I don't care about anything else
getting changed.
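
For the MIME-specific decoding suggested above, the standard library's
email.header module looks like the natural fit. What follows is only a
minimal sketch: decode_subject() is a name made up for illustration,
and it assumes the value handed in is whatever
mailbox.MaildirMessage.get("Subject", "") returned.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Turn RFC 2047 encoded-words in a header value into a plain str.

    decode_subject is a hypothetical helper name, not part of any library.
    """
    # decode_header() splits the value into (data, charset) pairs;
    # make_header() reassembles them into a Header that str() can render.
    return str(make_header(decode_header(raw_subject)))


# The Subject: from the original article, as quoted earlier in the thread.
raw = "=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?="
subject = decode_subject(raw)

# The filter test can then run against the decoded text.
matches = "Waterways Continental Europe" in subject
```

Unlike a blanket replace(), this should leave any real underscores in an
unencoded Subject: alone, and a Subject: with no encoded-words comes
back unchanged.
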
(When there are no accented characters in the Subject: the string is
"Waterways Continental Europe" so I can't easily change the search
text. I guess I could use an RE.)
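
The RE version might look something like the sketch below. The pattern
and the is_waterways() name are only illustrative; it accepts either a
space or an underscore between the words, so it should match both the
plain and the encoded form without touching the Subject: at all.

```python
import re

# Accept a space or an underscore between the three words.
LIST_TAG = re.compile(r"Waterways[ _]Continental[ _]Europe")


def is_waterways(subject):
    """Return True if the subject carries the list tag in either form.

    is_waterways is a hypothetical helper name used for illustration.
    """
    return LIST_TAG.search(subject) is not None
```

This only handles the Q-encoded case seen here; a Subject: that arrives
base64-encoded would still need proper decoding before the test.
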
--
Chris Green