[Twisted-Python] print unicode

Hello! I'm using Twisted 10.0 and, as usual, I sometimes print debug info with myunicodestr.encode('UTF-8'), which is saved to a logfile. Since switching to Twisted 10, I'm getting UnicodeEncodeError: 'ascii' codec can't encode characters... type(myunicodestr) returns <type 'unicode'>. What is the problem here? Thanks!

On Wednesday 05 May 2010, Pet wrote:
UTF-8 uses the full 8 bits of a byte, while ASCII only uses 7, so writing Unicode encoded as UTF-8 to an ASCII stream is not valid. I think recent Python versions are stricter about what is written to stdout/stderr than older versions, so it might not be related to Twisted itself. You can specify a different encoding for stdin/out/err by setting the PYTHONIOENCODING environment variable.

Bye, Maarten
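The 7-bit versus 8-bit point can be checked directly. This is a minimal sketch written for Python 3 (the thread itself runs Python 2.5), using the sample string that appears later in the thread; the helper name `can_encode` is hypothetical and only there for illustration.

```python
def can_encode(text, encoding):
    """Return True if every character in `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Sample value taken from the XML-RPC request described later in the thread.
sample = "c\u0142a"

# UTF-8 represents U+0142 with two bytes, both of which have the high bit set.
utf8_bytes = sample.encode("utf-8")

# Bytes at or above 0x80 have no meaning in a 7-bit ASCII stream.
high_bytes = [b for b in utf8_bytes if b >= 0x80]

print(utf8_bytes, high_bytes, can_encode(sample, "ascii"))
```

The same string encodes cleanly as UTF-8 but cannot be encoded as ASCII, which is the shape of the UnicodeEncodeError in the original report.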

On 05/05/10 13:31, Pet wrote:
I think this is highly dependent on your OS environment. For example:

Python 2.4.3 (#1, Oct 23 2006, 14:19:47)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] o
Type "help", "copyright", "credits" or "

[pjm3@wildfire ~]$ echo $LANG
en_GB.UTF-8

...but:

LANG=C python
Python 2.4.3 (#1, Oct 23 2006, 14:19:47)
[GCC 4.1.1 20060525 (Red Hat 4.1.1-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
position 0: ordinal not in range(128)

...i.e. here I can just print unicode characters, with nothing particularly special, provided my environment variables are set right.
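To see how the environment changes the encoding a Python process picks for stdout, one option is to start a child interpreter with PYTHONIOENCODING set and ask it what it chose. This is a sketch for Python 3, not a reproduction of the sessions above; the function name `child_stdout_encoding` is hypothetical.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and report the
    encoding it selected for sys.stdout, as a normalized codec name."""
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    reported = output.decode("ascii").strip()
    # Normalize spellings such as "UTF-8" and "utf8" to one canonical name.
    return codecs.lookup(reported).name


print(child_stdout_encoding("ascii"))
print(child_stdout_encoding("utf-8"))
```

The LANG setting influences the default in a similar way, but its exact effect depends on the Python version and platform, so it is not asserted here.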

On Wed, 2010-05-05 at 13:45 +0200, Pet wrote:
This works fine for me (Twisted trunk):

$ python2.5 -c "import sys; from twisted.python import log; \
log.startLogging(file('/tmp/log', 'w')); print \
u'\u1234'.encode('UTF-8')"
$ cat /tmp/log
2010-05-05 08:48:40-0400 [-] Log opened.
2010-05-05 08:48:40-0400 [-] ሴ

Can you include a minimal reproducing example?

On Wed, May 5, 2010 at 4:15 PM, Pet <petshmidt@googlemail.com> wrote:
It's pretty weird. I sent {'s': u'c\u0142a'} as a parameter to the Twisted XML-RPC server right after it was restarted, and it printed param['s'].encode('UTF-8') without errors. Immediately after that I sent the same request again and it failed to print it. I restarted the server again: the first request prints without errors, and all other requests raise exceptions. So it has nothing to do with the database.

On Wed, May 5, 2010 at 4:29 PM, Pet <petshmidt@googlemail.com> wrote:
Now I'm getting an exception:

File "/usr/local/tw10/lib/python2.5/site-packages/Twisted-10.0.0-py2.5-linux-x86_64.egg/twisted/python/log.py", line 555, in write
    d = (self.buf + data).split('\n')
exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 4: ordinal not in range(128)
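One possible reading of this traceback, which the thread does not confirm: in Python 2, concatenating a unicode object with a byte string implicitly decodes the byte string as ASCII. If the buffer had become a unicode object after an earlier write, a later UTF-8 encoded byte string would fail at exactly this kind of concatenation. The sketch below imitates that coercion explicitly in Python 3; `py2_style_concat` is a hypothetical helper, not Twisted code.

```python
def py2_style_concat(buf, data):
    """Imitate Python 2's implicit coercion when unicode text meets a byte string."""
    if isinstance(buf, str) and isinstance(data, bytes):
        # Python 2 performed this ASCII decode implicitly during `buf + data`.
        data = data.decode("ascii")
    return buf + data


encoded = "c\u0142a".encode("utf-8")

# Byte buffer plus byte string: plain concatenation, no decoding involved.
first = py2_style_concat(b"", encoded)

# Text buffer plus byte string: the ASCII decode step rejects byte 0xc5.
try:
    py2_style_concat("", encoded)
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(first, error)
```

If that reading is right, it would also fit the observation that the first request after a restart prints fine while later ones fail, since only the later writes meet a buffer of the other type.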

participants (4)
- Itamar Turner-Trauring
- Maarten ter Huurne
- Pet
- Phil Mayers