[issue14452] SysLogHandler sends invalid messages when using unicode

Vinay Sajip report at bugs.python.org
Thu Apr 5 16:53:52 CEST 2012


Vinay Sajip <vinay_sajip at yahoo.co.uk> added the comment:

Ok, I see what the problem is. I could go for option 1 - leave the BOM out, encode the string as UTF-8 but send it as just a bunch of bytes, i.e. the MSG-ANY variant of the spec. However, this could break any existing code that doesn't use structured data before the message, as you are doing, and relies on the MSG-UTF8 variant. So while agreeing with you that the situation isn't ideal, I don't see how I can change things while preserving backward compatibility, other than:

Introduce a class-level _insert_BOM attribute, which defaults to True but can be set to False on a per-instance basis. The BOM would only be inserted if the attribute were True, so the current behaviour is preserved, but you could set the attribute to False and leave the BOM out. However, this isn't ideal either, as it requires you to be sensitive to the exact version of Python in use - not easy if you're developing a library rather than an application.
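
A minimal sketch of how such a switch might behave, shown outside the real handler so it can be run on its own. The class name BOMSwitchSketch and the method build_payload are hypothetical and exist only for illustration; only the attribute name _insert_BOM comes from the proposal above.

```python
import codecs


class BOMSwitchSketch:
    """Hypothetical stand-in for the handler; not the stdlib class."""

    # Class-level default: keep the current behaviour (BOM inserted).
    _insert_BOM = True

    def build_payload(self, prio, msg):
        """Return the bytes that would go on the wire for one message.

        prio is the already formatted priority, e.g. '<14>'.
        """
        body = msg.encode('utf-8')
        if self._insert_BOM:
            # MSG-UTF8 variant: BOM followed by the UTF-8 text.
            body = codecs.BOM_UTF8 + body
        # Otherwise the MSG-ANY variant: just a bunch of bytes.
        return prio.encode('ascii') + body


default = BOMSwitchSketch()
no_bom = BOMSwitchSketch()
no_bom._insert_BOM = False  # per-instance override, class default untouched

print(default.build_payload('<14>', 'caf\u00e9'))
print(no_bom.build_payload('<14>', 'caf\u00e9'))
```

Setting the attribute on an instance shadows the class-level default, so other instances keep inserting the BOM.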

I will consider a different approach in 3.3, e.g. provide one or more overridable methods to construct the message sent across the socket.

Thoughts? If these methods won't suffice, you can always resort to subclassing the handler and overriding the emit method (not that that's ideal, either).
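
For reference, the subclassing workaround could look roughly like the sketch below. It is an assumption-laden illustration rather than the stdlib implementation: the class name NoBOMSysLogHandler is made up, the emit body is a simplified rewrite that only relies on attributes and methods SysLogHandler already exposes (facility, address, socket, socktype, unixsocket, mapPriority, encodePriority), and it skips details of the real method such as reconnecting a Unix socket or appending a trailing NUL.

```python
import logging
import logging.handlers
import socket


class NoBOMSysLogHandler(logging.handlers.SysLogHandler):
    """Hypothetical subclass that sends UTF-8 bytes without a BOM."""

    def emit(self, record):
        try:
            msg = self.format(record)
            # Priority prefix, e.g. '<14>' for facility user, level info.
            prio = '<%d>' % self.encodePriority(
                self.facility, self.mapPriority(record.levelname))
            # MSG-ANY style payload: plain UTF-8 bytes, no BOM.
            payload = prio.encode('ascii') + msg.encode('utf-8')
            if self.unixsocket:
                self.socket.send(payload)
            elif self.socktype == socket.SOCK_DGRAM:
                self.socket.sendto(payload, self.address)
            else:
                self.socket.sendall(payload)
        except Exception:
            self.handleError(record)
```

It would be attached like any other handler, e.g. logger.addHandler(NoBOMSysLogHandler(address=('localhost', 514))), where the address is whatever your syslog daemon listens on.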

----------
resolution: invalid -> 

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14452>
_______________________________________

