How to manage accented characters in mail header?
Peter J. Holzer
hjp-python at hjp.at
Mon Jan 6 14:43:21 EST 2025
On 2025-01-04 19:07:57 +0000, Chris Green via Python-list wrote:
> Stefan Ram <ram at zedat.fu-berlin.de> wrote:
> > Chris Green <cl at isbd.net> wrote or quoted:
> > >From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon at amvs.fr>
> >
> Is there a simple[r] way to extract just the 'real' address between
> the <>? That's all I actually need. I think it has to be the last
> chunk of the From:, doesn't it?
No,

    From: <sebastien.crignon at amvs.fr> (Sébastien Crignon)

would also be permissible (properly encoded, of course), and even

    From: < sebastien (Sébastien) . crignon (Crignon) @ amvs . fr >

(although I think the latter is deprecated).

Also, there can be more than one address in a From header.

To properly extract email addresses from a header, use
email.utils.getaddresses(). You don't have to decode the header first.
MIME encoding is not supposed to interfere with parsing headers for
machine-readable information like addresses or message IDs.
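
A minimal sketch of that approach, using only the standard library. The
helper name extract_addresses is made up for illustration, the list
archive's " at " obfuscation is assumed to stand for "@" and is written
out as such, and the example.com/example.org addresses are placeholders
that do not come from the thread:

```python
from email.utils import getaddresses


def extract_addresses(header_values):
    """Return the bare addresses found in a list of raw header values.

    Hypothetical helper for illustration, not part of the email package.
    """
    # getaddresses() accepts a list of raw header strings and returns
    # (display_name, address) pairs. The header is not decoded first;
    # only the address half of each pair is kept here.
    return [addr for _name, addr in getaddresses(header_values) if addr]


# The header from the thread, with "@" restored (assumption, see above).
encoded = "=?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>"
# The alternative form with the name in a trailing comment.
comment_form = "<sebastien.crignon@amvs.fr> (Sébastien Crignon)"
# More than one address in a single header (placeholder addresses).
several = "Alice <alice@example.com>, bob@example.org"

for value in (encoded, comment_form, several):
    print(extract_addresses([value]))
```

With a parsed message, the raw values would typically come from
msg.get_all("From", []). If the display name is needed as well, it can be
decoded separately afterwards, for example with
email.header.decode_header().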
hp
--
Peter J. Holzer | hjp at hjp.at | http://www.hjp.at/
"Story must make more sense than reality."
    -- Charles Stross, "Creative writing challenge!"