[New-bugs-announce] [issue35427] logging UnicodeDecodeError from undecodable strftime output
report at bugs.python.org
Thu Dec 6 11:39:34 EST 2018
New submission from Mark Dickinson <dickinsm at gmail.com>:
We're seeing UnicodeDecodeErrors on Windows / Python 2.7 when using logging on a Japanese customer machine. The cause turns out to be in the log record formatting, where unicode fields are combined with non-decodable bytestrings coming from strftime.
More details: we were using the following formatter:
_LOG_FORMATTER = logging.Formatter(
In the logging internals, that `datefmt` gets passed to `time.strftime`, which on the machine in question produced a non-ASCII bytestring (i.e., type `str`). When combined with other Unicode strings in the log record, this gave the `UnicodeDecodeError`.
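A minimal sketch of the failure mode described above, not the actual logging code path. The byte string is a hypothetical stand-in for locale-encoded `time.strftime` output, and the helper name `combine_like_python2` is invented for illustration. Python 2 performs the ASCII decode implicitly when a `str` is mixed with `unicode`; the sketch does it explicitly so that it behaves the same on Python 2 and Python 3.

```python
# Hypothetical stand-in for strftime output on a Japanese-locale machine:
# a weekday name encoded as cp932 bytes (an assumption, not the real data).
asctime = u"\u6728\u66dc\u65e5".encode("cp932")
message = u"caf\xe9"  # any unicode field in the log record


def combine_like_python2(fmt, byte_field, text_field):
    # Python 2 decodes the byte string as ASCII before combining it with
    # unicode operands; that implicit step is written out explicitly here.
    return fmt % (byte_field.decode("ascii"), text_field)


error = None
try:
    combine_like_python2(u"%s %s", asctime, message)
except UnicodeDecodeError as exc:
    # The first cp932 byte is outside the ASCII range, so decoding fails.
    error = exc
```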
I'm unfortunately failing to reproduce this directly on my own macOS / UK locale machine, but it's documented that `time.strftime` returns a value encoded according to the current locale. In this particular case, the output we were getting from the `time.strftime` call looked like:
which, assuming an encoding of cp932, decodes to something plausible:
It looks as though the logging module should explicitly decode the strftime output before formatting, for example using what's recommended in the strftime documentation:
Code links: this is the line that's producing non-decodable bytes:
... and this is the formatting operation that then ends up raising UnicodeDecodeError as a result of those:
This isn't an issue on Python 3, and I was unable to reproduce it on my non-Windows machine; that particular form of strftime output may well be specific to Windows (or possibly even specific to Japanese flavours of Windows).
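Until the logging module decodes the value itself, one possible application-side workaround is to decode the date string in a `Formatter` subclass. The sketch below is an assumption-laden illustration, not a proposed patch: the names `decode_strftime` and `DecodingFormatter` are hypothetical, and using `locale.getpreferredencoding(False)` as the encoding is a guess that may not match the encoding strftime actually uses on a given Windows machine.

```python
import locale
import logging


def decode_strftime(value, encoding=None):
    """Return strftime output as text, decoding byte strings if needed."""
    if not isinstance(value, bytes):
        return value  # already text (always the case on Python 3)
    # Assumption: the locale's preferred encoding matches the strftime output.
    encoding = encoding or locale.getpreferredencoding(False) or "ascii"
    # "replace" avoids raising if the guess about the encoding is wrong.
    return value.decode(encoding, "replace")


class DecodingFormatter(logging.Formatter):
    """Hypothetical formatter that never exposes undecoded date bytes."""

    def formatTime(self, record, datefmt=None):
        raw = logging.Formatter.formatTime(self, record, datefmt)
        return decode_strftime(raw)
```

With this in place, the `asctime` field is already text by the time the record is formatted, so combining it with other unicode fields no longer triggers an implicit ASCII decode.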
components: Library (Lib)
title: logging UnicodeDecodeError from undecodable strftime output
versions: Python 2.7
Python tracker <report at bugs.python.org>