
Vinay Sajip wrote:
>> You should use sys.getdefaultencoding() for this. Python's default encoding is "ascii", BTW, not "latin-1".
>
> OK, but supposing I log a message with the format string u"Marc-Andr\xe9". If I use "ascii" as the encoding, an exception is raised because the code for e-acute is > 128. How best should this situation be handled? I assumed that people would be using Unicode to log messages in other languages with accented characters etc. which are typically outside the ASCII range.

I don't know how you have implemented logging to files or streams, but in general it's better to pass the Unicode object to the stream and let the stream decide what to do with it. For example, you could open a log file in 'latin-1' mode (via codecs.open()) and then see the log entries as Latin-1.

To make things a little more robust (i.e. to prevent the log message from getting lost), I'd suggest wrapping the log writing code in try: ... except UnicodeError:. The except clause should then encode the log message to UTF-8 and write the resulting 8-bit string in a second attempt.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source (#1, Feb 16 2003)
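
The fallback suggested above (let the stream encode the message, and on UnicodeError write a UTF-8 copy instead) might look roughly like the sketch below. It is only an illustration, not the logging package's actual code: it is written for Python 3, whereas the thread predates it, and the names write_log_entry, raw and latin1_log are hypothetical. An in-memory io.BytesIO stands in for a real log file so the example is self-contained.

```python
import codecs
import io


def write_log_entry(encoded_stream, byte_stream, message):
    """Write one log line and report which path was taken (hypothetical helper)."""
    line = message + "\n"
    try:
        # Normal path: hand the text to the stream and let its codec encode it.
        encoded_stream.write(line)
        return "stream"
    except UnicodeError:
        # Fallback: the stream's codec cannot represent the text, so encode
        # to UTF-8 by hand and write the bytes to the underlying byte stream.
        byte_stream.write(line.encode("utf-8"))
        return "utf-8"


# Assumption: an in-memory buffer stands in for a log file opened in binary mode.
raw = io.BytesIO()
# A Latin-1 encoding writer on top of it, similar in spirit to codecs.open(..., 'latin-1').
latin1_log = codecs.getwriter("latin-1")(raw)

first = write_log_entry(latin1_log, raw, "Marc-Andr\xe9")    # e-acute fits in Latin-1
second = write_log_entry(latin1_log, raw, "price: \u20ac5")  # the euro sign does not
logged = raw.getvalue()
```

One consequence of this design is that a single log file can end up containing lines in two encodings, which is the price paid for never dropping a message.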