unicode strings and such
Garth Grimm
garth_grimm at hp.com
Thu Sep 13 15:46:50 EDT 2001
Ignacio Vazquez-Abrams wrote:
>>Any idea of what the str() is actually doing here? If I remove the str() statements from the
>>following code, I get different results (i.e. no substitutions will take place in qt). So either
>>pattern and patch (after the assignment at the beginning of the loop) aren't actually string
>>objects, or str() is doing something more than the API docs state -- "For strings, this returns the
>>string itself."
>>
>
> I still can't read the UTF-8 properly. Do the following and show me what your
> results are:
>
> ---
>
> >>> a='ãf~ãf«ãf--'
> >>> a
> '\xe3f~\xe3f\xab\xe3f--'
> >>> str(a)
> '\xe3f~\xe3f\xab\xe3f--'
> >>> unicode(str(a), 'utf-8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: UTF-8 decoding error: invalid data
>
> ---
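One possible reading of the quoted session, sketched in Python 3 rather than the 2.0.1 used in this thread: the \xe3 bytes look like the lead bytes of UTF-8 katakana whose continuation bytes were displayed as Windows-1252 and then flattened to ASCII somewhere in transit. The choice of cp1252 and the name "damaged" are assumptions made for illustration, not something the thread confirms.

```python
# Python 3 sketch (the thread itself uses Python 2.0.1).
original = "\u30d8\u30eb\u30d7"          # the katakana pasted later in this message
utf8_bytes = original.encode("utf-8")
print(utf8_bytes)                        # each character becomes three bytes starting with \xe3

# Assumption: a mail client showed those bytes as Windows-1252 text.
mojibake = utf8_bytes.decode("cp1252")
print(ascii(mojibake))                   # escaped, so it prints on any console

# The byte string from the quoted session: the \xe3 lead bytes survive,
# but plain ASCII characters sit where continuation bytes should be.
damaged = b"\xe3f~\xe3f\xab\xe3f--"
try:
    damaged.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```

If that reading is right, the decoding error says more about how the mail was mangled than about the original string.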
Not sure if any of this will be helpful. In both cases below, I'm cutting and pasting the string from a Unicode text editor.
At an NT console command line, I get:
Python 2.0.1 (#18, Jun 22 2001, 02:26:03) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> a='(???)(\d{3})'
>>> a
'(???)(\\d{3})'
>>> str(a)
'(???)(\\d{3})'
>>> unicode(str(a), 'utf-8')
u'(???)(\\d{3})'
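A possible explanation for the question marks, sketched in Python 3 rather than the 2.0.1 shown above: if the console's code page has no Japanese characters, the paste may be converted to literal "?" characters before Python ever sees it. The cp437 code page here is an assumption chosen for illustration; the actual console code page is not stated in this thread.

```python
# Python 3 sketch (the transcript above is Python 2.0.1).
text = "(\u30d8\u30eb\u30d7)(\\d{3})"    # the pattern with the three katakana characters

# Assumption: the console code page (cp437 here) cannot represent katakana,
# so each unsupported character is replaced with "?".
console_bytes = text.encode("cp437", errors="replace")
print(console_bytes)

# Plain "?" is ASCII, so decoding as UTF-8 succeeds, but the
# original characters are already gone.
round_trip = console_bytes.decode("utf-8")
print(round_trip)
```

That would explain why unicode(str(a), 'utf-8') works in the console session: by then the string contains only ASCII.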
Using IDLE 0.6, the characters I cut and paste to initialize the variable a look perfect (still Japanese glyphs),
but when I hit ENTER, I get the following dump:
>>> Exception in Tkinter callback
Traceback (most recent call last):
  File "e:\python20\lib\lib-tk\Tkinter.py", line 1287, in __call__
    return apply(self.func, args)
  File "E:\PYTHON20\Tools\idle\PyShell.py", line 579, in enter_callback
    self.runit()
  File "E:\PYTHON20\Tools\idle\PyShell.py", line 598, in runit
    more = self.interp.runsource(line)
  File "E:\PYTHON20\Tools\idle\PyShell.py", line 183, in runsource
    return InteractiveInterpreter.runsource(self, source, filename)
  File "e:\python20\lib\code.py", line 61, in runsource
    code = compile_command(source, filename, symbol)
  File "e:\python20\lib\codeop.py", line 61, in compile_command
    code = compile(source, filename, symbol)
UnicodeError: ASCII encoding error: ordinal not in range(128)
a='(ヘルプ)(\d{3})'
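The traceback suggests that the pasted line reaches compile() as a Unicode string and is then implicitly encoded as ASCII. That is an inference from the error message, not something confirmed from the IDLE source. A minimal Python 3 sketch of the same kind of failure, and of an explicit encoding that avoids it:

```python
# Python 3 sketch (the traceback above comes from Python 2.0.1 and IDLE 0.6).
source = "a='(\u30d8\u30eb\u30d7)(\\d{3})'"   # the line that was pasted into IDLE

# Encoding to ASCII fails because the katakana are outside range(128).
try:
    source.encode("ascii")
    error_text = None
except UnicodeEncodeError as exc:
    error_text = str(exc)
print(error_text)

# Naming an encoding that covers the characters works.
as_utf8 = source.encode("utf-8")
print(len(as_utf8))
```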
Garth