(Fucking) Unicode: console print statement and PythonWin: replacement for off-table chars HOWTO?

Robert kxroberto at googlemail.com
Wed Jan 11 09:21:56 EST 2006


Neil Hodgson wrote:
> Robert:
>     PythonWin did have some Unicode support but I think Mark Hammond was
> discouraged by bugs. In pythonwin/__init__.py there is a setting
> is_platform_unicode = 0 with a commented out real test for Unicode on
> the next line. Change this to 1 and restart and you may see
>
>  >>> x = u'sytest3\\\u041f\u043e\u0448\u0443\u043a.txt'
>  >>> print x
> sytest3\Пошук.txt
>  >>>

Thanks for the hint, but I found that the result is still inconsistent,
or even buggy:

After setting "is_platform_unicode = <auto>", Scintilla displays some
Unicode as you showed, but the win32 functions (e.g. MessageBox) still do
not pass wide Unicode through. Pasting, inserting, and parsing in
Scintilla also do not work correctly:

PythonWin 2.3.5 (#62, Feb  8 2005, 16:23:02) [MSC v.1200 32 bit
(Intel)] on win32.
Portions Copyright 1994-2004 Mark Hammond (mhammond at skippinet.com.au) -
see 'Help/About PythonWin' for further copyright information.
>>> x = u'sytest3\\\u041f\u043e\u0448\u0443\u043a.txt'
>>> print x
sytest3\Пошук.txt
>>> print "sytest3\Пошук.txt"
sytest3\?????.txt

!!!
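
A minimal sketch of what I suspect happens to the second print, written for
a current Python 3 interpreter rather than the Python 2.3 session above. The
assumption (mine, not confirmed in the PythonWin source) is that the plain
string literal is pushed through the ANSI code page, which has no Cyrillic
letters on a Western-European machine, so every off-table character is
replaced by "?". The name to_ansi and the choice of cp1252 as a stand-in for
"mbcs" are hypothetical:

```python
# Assumption: cp1252 stands in for the "mbcs" ANSI code page of a
# Western-European Windows install; the real code page may differ.
ANSI_CODEPAGE = "cp1252"


def to_ansi(text, errors="replace"):
    """Encode text to the ANSI code page, substituting off-table characters."""
    return text.encode(ANSI_CODEPAGE, errors)


name = "sytest3\\\u041f\u043e\u0448\u0443\u043a.txt"

# Cyrillic letters are not in cp1252, so each one becomes "?".
print(to_ansi(name))                      # b'sytest3\\?????.txt'

# "backslashreplace" keeps the information as escape sequences instead.
print(to_ansi(name, "backslashreplace"))

# German umlauts are in the table and survive the round trip.
print(to_ansi("Gr\u00fc\u00dfe"))         # b'Gr\xfc\xdfe'
```

If "?" is not an acceptable replacement, the "backslashreplace" or
"xmlcharrefreplace" error handlers at least keep the original code points
recoverable.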

--------

Then I tried switching more of the settings in __init__.py to utf-8:

default_platform_encoding = "utf-8"  #"mbcs" # Will it ever ...this?
default_scintilla_encoding = "utf-8" # Scintilla _only_ supports this ATM

Pasting around in Scintilla then works correctly, but MessageBox then
shows the raw utf-8 encoded bytes. Even German umlauts can no longer be
displayed on my machine, and when opening document files that contain
characters above 128, PythonWin breaks (because the files are not valid
utf-8 streams, I guess):

>>> Traceback (most recent call last):
  File
"C:\PYTHON23\Lib\site-packages\pythonwin\pywin\scintilla\document.py",
line 27, in OnOpenDocument
    text = f.read()
  File "C:\Python23\lib\codecs.py", line 380, in read
    return self.reader.read(size)
  File "C:\Python23\lib\codecs.py", line 253, in read
    return self.decode(self.stream.read(), self.errors)[0]
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position
19983: unexpected code byte
win32ui: OnOpenDocument() virtual handler (<bound method
SyntEditDocument.OnOpenDocument of
<pywin.framework.editor.color.coloreditor.SyntEditDocument instance at
0x00E356E8>>) raised an exception
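
The failure above is what a strict utf-8 decode does with a byte such as 0xa9
that was written in a single-byte code page. A rough sketch of a more
forgiving read, again for a current Python 3 interpreter: try utf-8 first and
fall back to a legacy code page. The function name decode_with_fallback and
the choice of cp1252 as the fallback are my assumptions; this is not what
document.py does.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252")):
    """Return (text, encoding_used) for the first encoding that decodes data.

    The candidate encodings are tried in order. latin-1 is the last resort
    because it maps every byte value and therefore cannot fail.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    return data.decode("latin-1"), "latin-1"


# 0xa9 is a stray continuation byte in utf-8, but the copyright sign in cp1252.
text, used = decode_with_fallback(b"\xa9 2005")
print(used)  # cp1252
```

The order matters: valid utf-8 is rarely produced by accident, so trying it
first is fairly safe, while a single-byte code page accepts almost anything.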


So the result is: no combination provides a real improvement so far.
Wide Unicode in the win32 functions is apparently not possible at all
with these settings. I am switching back to the original setup.

I guess I have to write special C code for my major wide-Unicode needs,
especially listctrl SetItem and the TextOut stuff...

Or does anybody know of some existing wide-unicode functions/C-code
parallel to normal pywin32?
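
One route that might avoid custom C code is to call the wide ("W") variants
of the Win32 functions directly through the ctypes package. The sketch below
is untested against PythonWin and is written for a current Python 3
interpreter. The helper names to_wide and message_box_w are hypothetical;
MessageBoxW itself is the real user32 export.

```python
import ctypes
import sys


def to_wide(text):
    """Return a NUL-terminated wchar_t buffer holding text."""
    return ctypes.create_unicode_buffer(text)


def message_box_w(text, title="Unicode test"):
    """Show text via the wide MessageBoxW; do nothing off Windows."""
    if sys.platform != "win32":
        return None  # ctypes.windll only exists on Windows
    # Arguments: owner window (none), text, caption, type (0 = MB_OK).
    return ctypes.windll.user32.MessageBoxW(None, text, title, 0)


buf = to_wide("sytest3\\\u041f\u043e\u0448\u0443\u043a.txt")
print(len(buf))  # number of wchar_t slots, including the terminator
message_box_w(buf.value)
```

Whether the same approach is practical for listctrl SetItem and TextOut
depends on getting at the underlying window and device-context handles,
which I have not checked.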

Robert



