[docs] [issue21492] email.header.decode_header sometimes returns bytes, sometimes str

Ezio Melotti report at bugs.python.org
Tue Jun 4 00:20:03 EDT 2019


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

If we can't fix the behavior, it should at least be documented.

Currently the docs say "This function returns a list of (decoded_string, charset) pairs containing each of the decoded parts of the header.".  One could assume that this means that a Unicode string is returned, but as far as I can tell, "decoded_string" means decoded from the format used by the header, not from bytes -- in fact the example that follows it in the docs shows a byte string.
#24797 suggests an alternative solution, but there is no indication of it in the docs except an easy-to-miss note about the new API at the top.

Coincidentally, as I was reporting this issue, I also found the recently opened #37139.  There are also a few other reports: #24797, #37139, #32975, #6302, #4661.

If this method is not actually deprecated, I would document the current behavior (i.e. sometimes it returns bytes, sometimes unicode -- bonus points if there's a simple rule to predict which one), explain that it exists for legacy/backward-compatibility reasons, and point to the alternatives.
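As an illustration of what "point to the alternatives" could look like, here is a minimal sketch. It assumes the alternative in question is the policy-based API (email.policy.default), where header values are handed back as str with encoded words already decoded. The message text is made up for the example.

```python
from email import message_from_string
from email.policy import default

# Hypothetical raw message containing one RFC 2047 encoded word.
raw = "Subject: =?utf-8?b?SOKCrGxsbw==?= world\n\nbody\n"

# With policy=default the parser returns header objects that are
# str subclasses holding the decoded text, so no bytes are involved.
msg = message_from_string(raw, policy=default)
subject = str(msg["Subject"])

print(type(subject).__name__, repr(subject))
```

With this route the caller never needs to call decode_header directly.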


FWIW here are 3 more samples that show the inconsistency.

>>> from email.header import decode_header
>>> # str + None
>>> h = '\x80SOKCrGxsbw===== <hello at example.com>'; decode_header(h)
[('\x80SOKCrGxsbw===== <hello at example.com>', None)]
>>> # bytes + '', bytes + None
>>> h = '=??b?SOKCrGxsbw=====?= <hello at example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', ''), (b' <hello at example.com>', None)]
>>> # bytes + 'utf8', bytes + None
>>> h = '=?utf8?b?SOKCrGxsbw==?= <hello at example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', 'utf8'), (b' <hello at example.com>', None)]
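
Judging only from the samples above, the parts come back as str when no encoded word is recognized and as bytes otherwise. Until that is documented, a caller can normalize both cases. The helper below is a sketch, not part of the email package: the name decode_header_to_str, the "ascii" fallback, and the errors="replace" choice are all assumptions made for the example.

```python
from email.header import decode_header


def decode_header_to_str(value, fallback="ascii"):
    """Join the parts returned by decode_header into a single str.

    Hypothetical helper: handles both str and bytes parts.
    """
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # charset may be None or '' (see the second sample),
            # so fall back to a default codec in both cases.
            codec = charset or fallback
            try:
                parts.append(chunk.decode(codec, errors="replace"))
            except LookupError:
                # Unknown charset name: decode with the fallback instead.
                parts.append(chunk.decode(fallback, errors="replace"))
        else:
            # No encoded word was found; the part is already a str.
            parts.append(chunk)
    return "".join(parts)


h = '=?utf8?b?SOKCrGxsbw==?= <hello at example.com>'
print(repr(decode_header_to_str(h)))
```

Undecodable bytes are replaced rather than raising, which may or may not be what a given caller wants.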

----------
assignee:  -> docs at python
components: +Documentation
nosy: +docs at python, ezio.melotti, louis.abraham at yahoo.fr
resolution: duplicate -> 
stage: resolved -> needs patch
status: closed -> open
type: behavior -> enhancement

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue21492>
_______________________________________

