unicode strings and such

Garth Grimm garth_grimm at hp.com
Thu Sep 13 15:46:50 EDT 2001


Ignacio Vazquez-Abrams wrote:

>>Any idea of what the str() is actually doing here?  If I remove the str() statements from the
>>following code, I get different results (i.e. no substitutions will take place in qt).  So either
>>pattern and patch (after the assignment at the beginning of the loop) aren't actually string
>>objects, or str() is doing something more than the API docs state -- "For strings, this returns the
>>string itself."
>>
> 
> I still can't read the UTF-8 properly. Do the following and show me what your
> results are:
> 
> ---
> 
> >>> a='ãf~ãf«ãf--'
> >>> a
> '\xe3f~\xe3f\xab\xe3f--'
> 
> >>> str(a)
> '\xe3f~\xe3f\xab\xe3f--'
> 
> >>> unicode(str(a), 'utf-8')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: UTF-8 decoding error: invalid data
> 
> ---
Not sure if any of this will be helpful.  In both cases, I'm cutting and pasting from a Unicode text editor.
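
A side note on the string in the quoted session: it looks like the UTF-8 bytes of the three katakana characters (the ones in the IDLE echo further down) read as Windows-1252 and then flattened to ASCII lookalikes somewhere in the mail path. The following is only a sketch of that guess. It assumes a current Python 3 interpreter rather than the 2.0.1 used here, and cp1252 is an assumed code page, not something the thread confirms.

```python
# Assumption: Python 3, not the Python 2.0.1 used in this thread.
original = "ヘルプ"                      # the three katakana characters
utf8_bytes = original.encode("utf-8")    # E3 83 98 E3 83 AB E3 83 97

# Assumed code page: reading those bytes as cp1252 gives the mojibake.
mojibake = utf8_bytes.decode("cp1252")
print(ascii(mojibake))

# The bytes from the quoted session: each \xe3 lead byte is followed by
# a plain ASCII 'f' instead of a continuation byte, so this is not UTF-8.
quoted = b"\xe3f~\xe3f\xab\xe3f--"
try:
    quoted.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```

If that guess is right, the decoding error in the quoted session comes from the mangled copy of the string, not from the original data.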


With an NT console command line, I get...

Python 2.0.1 (#18, Jun 22 2001, 02:26:03) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
 >>> a='(???)(\d{3})'
 >>> a
'(???)(\\d{3})'
 >>> str(a)
'(???)(\\d{3})'
 >>> unicode(str(a), 'utf-8')
u'(???)(\\d{3})'

Using IDLE 0.6, the characters I cut and paste to initialize a look perfect (still Japanese glyphs),
but when I hit ENTER, I get the following dump...

 >>> Exception in Tkinter callback
Traceback (most recent call last):
   File "e:\python20\lib\lib-tk\Tkinter.py", line 1287, in __call__
     return apply(self.func, args)
   File "E:\PYTHON20\Tools\idle\PyShell.py", line 579, in enter_callback
     self.runit()
   File "E:\PYTHON20\Tools\idle\PyShell.py", line 598, in runit
     more = self.interp.runsource(line)
   File "E:\PYTHON20\Tools\idle\PyShell.py", line 183, in runsource
     return InteractiveInterpreter.runsource(self, source, filename)
   File "e:\python20\lib\code.py", line 61, in runsource
     code = compile_command(source, filename, symbol)
   File "e:\python20\lib\codeop.py", line 61, in compile_command
     code = compile(source, filename, symbol)
UnicodeError: ASCII encoding error: ordinal not in range(128)
a='(ヘルプ)(\d{3})'
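
The traceback suggests, though I can't confirm it, that IDLE passed the pasted line to compile() as a Unicode string and the conversion to a plain string went through the ASCII codec. A small sketch of that same failure, assuming Python 3 (where the exception is called UnicodeEncodeError rather than UnicodeError):

```python
# Assumption: Python 3, not the Python 2.0.1 that produced the dump above.
source_line = "a='(ヘルプ)(\\d{3})'"

try:
    source_line.encode("ascii")          # katakana are outside ASCII
    message = ""
except UnicodeEncodeError as exc:
    message = str(exc)

print(message)

# Naming an encoding that covers the characters avoids the error.
encoded = source_line.encode("utf-8")
print(encoded.decode("utf-8") == source_line)
```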

Garth





