Unicode question : turn "José" into u"José"
aurora
aurora00 at gmail.com
Wed Apr 5 15:57:16 EDT 2006
First of all, if you are running this in a console, find out which
encoding the console uses. Mine is an English Windows XP console, which
uses 'cp437':
C:\>chcp
Active code page: 437
Then
>>> s = "José"
>>> u = u"Jos\u00e9" # same thing in unicode escape
>>> s.decode('cp437') == u # use the encoding that matches your console
True
>>>
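
For anyone trying this on Python 3, where byte strings are written with a
b prefix and str is already unicode, the same idea can be sketched as
below. This is only an illustration: to_text is a made-up helper name, and
the byte values assume that the console would have produced 0x82 for é
under cp437 and 0xe9 under latin-1.

```python
import sys

def to_text(raw, encoding=None):
    """Decode a byte string to text (hypothetical helper, not from the post).

    If no encoding is given, fall back to the encoding reported for stdout,
    and to UTF-8 if that is unavailable. This fallback is an assumption.
    """
    if encoding is None:
        encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
    return raw.decode(encoding)

expected = "Jos\u00e9"          # 'José' written with a unicode escape

cp437_bytes = b"Jos\x82"        # é is byte 0x82 in cp437
latin1_bytes = b"Jos\xe9"       # é is byte 0xe9 in latin-1

print(to_text(cp437_bytes, "cp437") == expected)     # True
print(to_text(latin1_bytes, "latin-1") == expected)  # True
print(to_text(latin1_bytes, "cp437") == expected)    # False: wrong codec
```

The last line shows why the encoding has to match where the bytes came
from: the decode succeeds either way, but with the wrong codec it quietly
gives a different character.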
wy
> This is probably stupid and/or misguided but supposing I'm passed a
> byte-string value that I want to be unicode, this is what I do. I'm sure
> I'm missing something very important.
>
> Short version :
>
> >>> s = "José" #Start with non-unicode string
> >>> unicoded = eval("u'%s'" % "José")
>
> Long version :
>
> >>> s = "José" #Start with non-unicode string
> >>> s #Lets look at it
> 'Jos\xe9'
> >>> escaped = s.encode('string_escape')
> >>> escaped
> 'Jos\\xe9'
> >>> unicoded = eval("u'%s'" % escaped)
> >>> unicoded
> u'Jos\xe9'
>
> >>> test = u"José" #What they should have passed me
> >>> test == unicoded #Am I really getting the same thing?
> True #Yay!