logging of strings with broken encoding
Lie Ryan
lie.1296 at gmail.com
Thu Jul 2 13:14:34 EDT 2009
Thomas Guettler wrote:
> My quick fix is this:
>
> class MyFormatter(logging.Formatter):
>     def format(self, record):
>         msg=logging.Formatter.format(self, record)
>         if isinstance(msg, str):
>             msg=msg.decode('utf8', 'replace')
>         return msg
>
> But I still think handling of non-ascii byte strings should be better.
> A broken logging message is better than none.
>
The problem is that Python 2.x assumes a default encoding of `ascii`
whenever you don't specify one explicitly, and your code apparently
breaks that assumption. I haven't looked at your code, but others have
suggested that you are feeding the logging module non-ascii byte
strings. The logging module can only work with 1) unicode strings and
2) ascii-encoded byte strings.

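
To make that implicit step concrete: when Python 2 mixes a byte string
with a unicode string, it decodes the bytes with the default codec
(ascii), which is roughly the explicit call below. This is only a
sketch, written so that it behaves the same on Python 2.6+ and
Python 3; the variable names are mine.

```python
# -*- coding: utf-8 -*-
raw = b'\xd1\x8b'  # UTF-8 bytes of the Cyrillic letter 'ы'

try:
    # Roughly what Python 2 does behind your back when no encoding is given.
    raw.decode('ascii')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    reason = str(exc)

# Naming the real encoding explicitly decodes the same bytes without error.
text = raw.decode('utf-8')
```
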
If you want a quick fix, you may be able to get away with repr()-ing
your log texts. A proper fix, however, is to pass a unicode string to
the logging module instead.
>>> logging.warn('ы') # or logging.warn('\xd1\x8b')
Traceback (most recent call last):
  File "/usr/lib64/python2.6/logging/__init__.py", line 773, in emit
    stream.write(fs % msg.encode("UTF-8"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd1 in position 13: ordinal not in range(128)
>>> logging.warn(repr('ы'))
WARNING:root:'\xd1\x8b'
>>> logging.warn(u'ы')
WARNING:root:ы
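
If your messages arrive as byte strings, a proper fix along these lines
could look like the sketch below. `to_text` is a hypothetical helper
name, not part of the logging module, and it assumes your byte strings
are meant to be UTF-8. Bytes that cannot be decoded become U+FFFD
instead of raising an exception.

```python
import logging

def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding byte strings leniently."""
    # `bytes` is an alias of `str` on Python 2.6+, so this check covers
    # byte strings on both Python 2 and Python 3.
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    return value

logging.basicConfig()
logging.warning(to_text(b'\xd1\x8b'))     # valid UTF-8 bytes
logging.warning(to_text(b'\xff broken'))  # invalid byte is replaced, not fatal
```

Decoding at the point where the bytes enter your program, rather than
inside a formatter, keeps everything downstream working with unicode
only.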