[issue1379416] email.Header encode() unicode P2.6

Jan Novak report at bugs.python.org
Mon Mar 23 11:12:26 CET 2009


Jan Novak <xnovakj at users.sourceforge.net> added the comment:

I ran some new tests on Python 2.6.1:

>>> import email.charset

>>> c=email.charset.Charset('utf-8')
>>> print c.input_charset, type(c.input_charset)
utf-8 <type 'unicode'>
>>> print c.output_charset, type(c.output_charset)
utf-8 <type 'str'>

but for iso-8859-2 the output charset is unicode as well:

>>> c=email.charset.Charset('iso-8859-2')
>>> print c.input_charset, type(c.input_charset)
iso-8859-2 <type 'unicode'>
>>> print c.output_charset, type(c.output_charset)
iso-8859-2 <type 'unicode'>

but if you use the alias latin-2, both attributes are str:

>>> c=email.charset.Charset('latin-2')
>>> print c.input_charset, type(c.input_charset)
iso-8859-2 <type 'str'>
>>> print c.output_charset, type(c.output_charset)
iso-8859-2 <type 'str'>
>>> 

The error occurs when input_charset is unicode, because the value is
passed along the chain
self.input_charset -> conv -> self.output_charset

in module email/charset.py, line 219:

        if not conv:
            conv = self.input_charset

This affects the charsets that have no output conversion:

CHARSETS = {
    # input        header enc  body enc output conv
    'iso-8859-1':  (QP,        QP,      None),
    'iso-8859-2':  (QP,        QP,      None),

and only when you don't use an alias (the alias table values are plain str):

ALIASES = {
    'latin_1': 'iso-8859-1',
    'latin-1': 'iso-8859-1',
    'latin_2': 'iso-8859-2',
    'latin-2': 'iso-8859-2',
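
The lookup path described above can be modelled in a few lines. This is a simplified sketch, not the actual email/charset.py source: the tables are trimmed to the entries quoted in this message, the 'utf-8' row and the (SHORTEST, BASE64, None) default are assumptions chosen to match the observed output, and the class U is a hypothetical stand-in for Python 2's unicode type so that the sketch also runs on interpreters where str and unicode are the same type.

```python
class U(str):
    """Hypothetical stand-in for Python 2's unicode type."""

    def lower(self):
        # Keep the stand-in type, the way unicode.lower() returns unicode.
        return U(str.lower(self))


# Placeholder encoding constants (the values are arbitrary in this model).
QP, BASE64, SHORTEST = 1, 2, 3

# Trimmed tables; the 'utf-8' row is an assumption matching the output above.
CHARSETS = {
    'iso-8859-1': (QP, QP, None),
    'iso-8859-2': (QP, QP, None),
    'utf-8': (SHORTEST, BASE64, 'utf-8'),
}
ALIASES = {
    'latin_1': 'iso-8859-1',
    'latin-1': 'iso-8859-1',
    'latin_2': 'iso-8859-2',
    'latin-2': 'iso-8859-2',
}


def resolve(name):
    """Return (input_charset, output_charset) as this model would set them."""
    name = U(name).lower()  # models unicode(name, 'ascii').lower()
    # An alias hit returns the plain str stored in the table.
    input_charset = ALIASES.get(name, name)
    henc, benc, conv = CHARSETS.get(input_charset, (SHORTEST, BASE64, None))
    if not conv:
        # No output conversion: the unicode value is reused as-is.
        conv = input_charset
    output_charset = ALIASES.get(conv, conv)
    return input_charset, output_charset


def type_names(name):
    """Type names of the two resolved attributes, for quick comparison."""
    return tuple(type(value).__name__ for value in resolve(name))
```

In this model, type_names('utf-8') gives ('U', 'str'), type_names('iso-8859-2') gives ('U', 'U'), and type_names('latin-2') gives ('str', 'str'), which mirrors the three interactive sessions above.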

But the real source of this error is on line 208:
 input_charset = unicode(input_charset, 'ascii')

because this construction always returns unicode:

>>> print type(unicode('iso-8859-2','ascii'))
<type 'unicode'>
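
One possible way to avoid the leak is sketched below. It is offered only as an illustration and not as the fix adopted by the email package: the name is checked to be pure ASCII and then handed back as the interpreter's native str. The helper name normalize_charset_name is hypothetical.

```python
def normalize_charset_name(name):
    """Lower-case an ASCII charset name and return it as a native str."""
    if isinstance(name, bytes):
        # Byte-string input: strict decoding rejects non-ASCII names.
        text = name.decode('ascii')
    else:
        # Text input: encoding is used only as an ASCII validity check.
        name.encode('ascii')
        text = name
    # str() on ASCII-only text yields the native string type
    # (a byte string on Python 2, a text string on Python 3).
    return str(text.lower())
```

With such a helper, a name like 'ISO-8859-2' would come back as the str 'iso-8859-2', so the fallback on line 219 would no longer propagate a unicode object; a non-ASCII name raises UnicodeError.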

----------
title: email.Header encode() unicode P2.3xP2.4 -> email.Header encode() unicode P2.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1379416>
_______________________________________

