How to manage accented characters in mail header?

Peter J. Holzer hjp-python at hjp.at
Mon Jan 6 14:43:21 EST 2025


On 2025-01-04 19:07:57 +0000, Chris Green via Python-list wrote:
> Stefan Ram <ram at zedat.fu-berlin.de> wrote:
> > Chris Green <cl at isbd.net> wrote or quoted:
> > >From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon at amvs.fr>
> > 
> Is there a simple[r] way to extract just the 'real' address between
> the <>, that's all I actually need.  I think it has to be the last
> chunk of the From: doesn't it?

No,
    From: <sebastien.crignon at amvs.fr> (Sébastien Crignon)
would also be permissible (properly encoded, of course), and even
    From: < sebastien (Sébastien) . crignon (Crignon) @ amvs . fr >
(although I think the latter is deprecated).

And also, there can be more than one address in a From header.

To properly extract email addresses from a header, use
email.utils.getaddresses(). You don't have to decode the header first.
The MIME-encoding is supposed to not interfere with parsing headers for
machine-readable information like addresses or message ids.
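
A minimal sketch of that approach, for illustration only. The header
values and the example.org addresses below are made up (only the
encoded display name is taken from the quoted message). The names are
decoded separately and only for display:

```python
from email.header import decode_header, make_header
from email.utils import getaddresses

# Hypothetical From header values; getaddresses() takes a list of them.
from_headers = [
    "=?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien@example.org>",
    "<alice@example.org> (Alice Example), Bob <bob@example.org>",
]

# Each result is a (display_name, address) pair. The headers are passed
# in still MIME-encoded; the addresses come out regardless.
pairs = getaddresses(from_headers)
addresses = [addr for _name, addr in pairs]

# Optional: decode the (still encoded) display name for humans.
first_name = str(make_header(decode_header(pairs[0][0])))

print(addresses)
print(first_name)
```

With a parsed message object, msg.get_all("From", []) would supply the
list of header values.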

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"