Python beginner, unicode encode/decode Q

anonymous anonymous at anonymous.com
Mon Jul 14 08:51:06 EDT 2008


1 Objective: to write little programs to help me learn German.  See the code
after the numbered comments.  Thanks in advance for any direction or
suggestions.

tk

2  I want to take answers from the keyboard, for example:

answer_str  = raw_input(' Enter answer > ')

[ At the prompt I type the following characters: Herr Üü ]
print answer_str
The output on screen is > Herr Üü

3  The history 1 and history 2 code was run interactively under Debian
Linux (Python 2.4) and interactively under Windows 98, first edition
(IDLE, Python 2.3.5), and it works.

4  The history 3 and history 4 code, run from within a .py file, produces
different output from the example in the book.
 
5 I want to work under Debian Linux, but because the program failed when
I ran the code from a file under Linux Python, I thought I should fire up
the win98 IDLE/Python program to see whether it ran there. It failed
there too when run from within a file.

6 The sample code is from pages 108-109 of "Python for Dummies".
      The book says: "Python's file objects and StringIO objects
don't support raw Unicode; the usual workaround is to encode Unicode as
UTF-8 before saving it to a file or StringIO object."
The sample code from the book uses French, as shown here, but trying
German produces the same result.
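
For reference, the workaround quoted from the book can be sketched roughly
as follows. This is only an illustration, not the book's code. The helper
name save_and_reload is hypothetical, and io.BytesIO (available from
Python 2.6 on) stands in for the StringIO object the book mentions; on
Python 2.4 the StringIO module would presumably play that role.

```python
import io

text = u'Libert\u00e9'   # a Unicode string of 7 characters


def save_and_reload(unicode_text):
    """Encode to UTF-8, store in an in-memory byte buffer, read back, decode.

    Hypothetical helper, used here only to illustrate the round trip.
    """
    buf = io.BytesIO()
    buf.write(unicode_text.encode('utf-8'))   # the buffer only holds bytes
    raw = buf.getvalue()
    return raw, raw.decode('utf-8')           # decode turns bytes back into Unicode


raw_bytes, restored = save_and_reload(text)
print(len(raw_bytes))     # 8, because e-acute takes two bytes in UTF-8
print(restored == text)   # True
```

The round trip gives back the original Unicode string, which matches what
the interactive sessions below show.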

7 I have searched the net under all the keywords I could think of, but this
is as close as I get to accomplishing my task.  I suspect I may not be
understanding the statement that StringIO objects don't support raw
Unicode, but I don't know.


#_*_ coding: utf-8 _*_

# history 1: code run interactively from a terminal under Debian Linux; it works

print " u'Libert\u00e9' "

# y = raw_input('Enter >')  commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

history 1 works; here is the screen copy of the interactive session

 >>> y = raw_input ('>')
 >Libert\xc3\xa9
 >>> q = 'Libert\xc3\xa9'
 >>> q.decode('utf-8')
u'Libert\xe9'
 >>> print q
Liberté
 >>>

[  screen output is next line ]

Lberté



history 2 
# code run interactively within IDLE under win98, first edition; it
# succeeded in producing the correct results.


# y = raw_input('Enter >')  commented out

y = u'Lbert\u00e9'
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

history 2 works; here is the screen copy of the interactive session

 >>> y = raw_input ('>')
 >Libert\xc3\xa9
 >>> q = 'Libert\xc3\xa9'
 >>> q.decode('utf-8')
u'Libert\xe9'
 >>> print q
Liberté
 >>>

[  screen output is next line ]

Lberté




# history 3

# this code is run from within IDLE on win98, inside a Python file.
# The code DOES NOT produce the proper output.

#_*_ coding: utf-8 _*_

# print "u'Libert\u00e9'"  printed to screen

y = raw_input('Enter >') 

# y = u'Lbert\u00e9' commented out
 
y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

I entered u'Libert\u00e9' at the prompt to copy it into the string y
Enter >u'Libert\u00e9'

u'Libert\u00e9'

The code DOES NOT produce Liberté but instead produces u'Libert\u00e9'

# history 4

# this code is run from a terminal on Debian Linux, inside a
# Python file.
# The code does not produce the proper output; it produces the same output
# as the run on Windows.

#_*_ coding: utf-8 _*_

print "u'Libert\u00e9'"  # printed to screen

y = raw_input('Enter >') 

# y = u'Lbert\u00e9' commented out

y.encode('utf-8')
q = y.encode('utf-8')
q.decode('utf-8')
print q.decode('utf-8')

# the output on the lines below was produced on the screen after the run

I entered u'Libert\u00e9' at the prompt to copy it into the string y
Enter >u'Libert\u00e9'
u'Libert\u00e9'

The code DID NOT produce Liberté but instead produced u'Libert\u00e9'
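
One possible reading of the history 3 and 4 results, offered as an
assumption and not something the book states: text typed at the prompt is
taken character by character, so the backslash, the u and the four hex
digits arrive as six separate characters instead of one é. The sketch below
simulates the typed text with a literal (the name typed is a stand-in for
what raw_input returns) and compares it with the escape as it is handled
inside source code. The unicode_escape codec at the end is one way to
interpret such typed escapes; whether it fits this program is untested.

```python
# Stand-in for the text typed at the prompt in history 3 and 4.
# The doubled backslash keeps \u00e9 as six literal characters.
typed = u"u'Libert\\u00e9'"

# The same escape written in source code is turned into one character.
literal = u'Libert\u00e9'

print(len(typed))     # 15: includes the u, both quotes and the escape text
print(len(literal))   # 7

# Encoding to UTF-8 and decoding again changes nothing, so the typed
# text comes back exactly as it was entered.
round_trip = typed.encode('utf-8').decode('utf-8')
print(round_trip == typed)    # True

# Strip the leading u' and the trailing ', then let the unicode_escape
# codec interpret the backslash sequence.
inner = typed[2:-1]
converted = inner.encode('ascii').decode('unicode_escape')
print(converted == literal)   # True
```

If this reading is right, typing the actual characters (for example Herr Üü
as in item 2) would sidestep the escape question, since nothing would need
interpreting.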


