[ python-Bugs-1743795 ] Some incorrect national characters (Polish) in unicodedata
SourceForge.net
noreply at sourceforge.net
Sat Jun 30 19:34:06 CEST 2007
Bugs item #1743795, was opened at 2007-06-26 20:45
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1743795&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
>Group: 3rd Party
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: admindomeny (admindomeny)
Assigned to: Mark Hammond (mhammond)
Summary: Some incorrect national characters (Polish) in unicodedata
Initial Comment:
Hello,
This problem regards pythonwin (I haven't checked whether unix/commandline python is affected), Python 2.5.1.
Examples on attached screenshot.
E.g. print u'\N{LATIN SMALL LETTER A WITH CIRCUMFLEX}' prints the wrong character (latin small a with some caret above it, it seems), as well as
print unicodedata.name( / latin small letter a with circumflex, typed in Windows using Polish "programmer's keyboard" / ) produces 'SUPERSCRIPT ONE', which is obviously incorrect.
----------------------------------------------------------------------
>Comment By: Martin v. Löwis (loewis)
Date: 2007-06-30 19:34
Message:
Logged In: YES
user_id=21627
Originator: NO
Actually, we should close this here. Please report it through the
PythonWin bugtracker.
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2007-06-27 21:38
Message:
Logged In: YES
user_id=38388
Originator: NO
Assigning to Mark Hammond who wrote Pythonwin.
----------------------------------------------------------------------
Comment By: admindomeny (admindomeny)
Date: 2007-06-27 19:25
Message:
Logged In: YES
user_id=1829093
Originator: YES
You were correct: the attached test file for Polish national characters
shows the correct character encodings when run in Pythonwin, provided it
was edited correctly as Unicode with Polish characters.
The problem of entering characters in Pythonwin remains, however (OS: Win
XP SP2, Polish edition): I have tried changing fonts to what are Unicode
fonts as far as I know (Times New Roman, Arial, etc), including CE fonts as
well. It doesn't work.
I made sure that Polish Programmer's Keyboard is turned on which gives me
correct encoding in almost all Windows applications, including Unicode
editors like UniRed. Still, Pythonwin shell in particular thinks that
AltGr+a (standard way of entering 'LATIN SMALL LETTER A WITH OGONEK') is
actually 'SUPERSCRIPT ONE' for example.
So, to summarize:
1. IDLE edits the text in Unicode correctly provided there's a #-*-
coding: utf-8 -*- header in first line.
2. Pythonwin executes that file correctly.
3. Pythonwin enters national characters INCORRECTLY (at least as far as
Polish is concerned, but I suspect it's also the case with other
languages).
File Added: test.py
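One plausible explanation for the 'SUPERSCRIPT ONE' symptom, offered as an assumption and not confirmed anywhere in this thread: the keystroke may arrive as a single byte in the Windows-1250 code page and then be interpreted as Latin-1. The sketch below (written for Python 3, not the Python 2.5.1 of the report) only shows that this byte mix-up yields the name the reporter saw; it does not demonstrate what Pythonwin actually does internally.

```python
import unicodedata

# The character the reporter typed with AltGr+a.
typed = "\N{LATIN SMALL LETTER A WITH OGONEK}"  # U+0105

# Windows-1250 (the Central European ANSI code page) stores it as one byte.
raw = typed.encode("cp1250")

# Assumed faulty step: the same byte is decoded as Latin-1 instead.
misread = raw.decode("latin-1")

print(repr(raw))
print(unicodedata.name(typed))
print(unicodedata.name(misread))
```

If the assumption holds, the last line prints the same wrong name reported for the Pythonwin shell, while the character itself is named correctly by unicodedata.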
----------------------------------------------------------------------
Comment By: M.-A. Lemburg (lemburg)
Date: 2007-06-27 10:28
Message:
Logged In: YES
user_id=38388
Originator: NO
This sounds more like a problem with entry of Unicode characters in
pythonwin than the unicodedata module.
Please create a test.py file with the character using e.g. UTF-8 as source
code encoding and run that through the Python interpreter directly to see
if the problem persists.
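A minimal sketch of such a test file, assuming it is saved as UTF-8 and run with the interpreter directly. It is written for Python 3; under the Python 2.5.1 of the report the literals would need a u'' prefix and print would be a statement. The variable names are illustrative only and are not taken from the attached test.py.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Character typed directly into the UTF-8 encoded source file.
literal = "ą"

# The same character written as an escape, independent of keyboard input.
escaped = "\u0105"

print(unicodedata.name(literal))
print(literal == escaped)
```

If the name printed here is correct, unicodedata itself is behaving as expected and the problem lies in how the character reaches the interpreter.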
----------------------------------------------------------------------