string to unicode
Tim Roberts
timr at probo.com
Tue Aug 16 20:32:12 EDT 2011
Artie Ziff <artie.ziff at gmail.com> wrote:
>
>if I am using the standard csv library to read contents of a csv file
>which contains Unicode strings (short example:
>'\xe8\x9f\x92\xe8\x9b\x87'),
You need to be rather precise when talking about this. That's not a
"Unicode string" in Python terms. It's an 8-bit string. It might be UTF-8
encoded text. If so, it decodes to two Unicode code points, U+87D2 and
U+86C7, which are both CJK ideographs. Is that what you expected?
C:\Dev\videology\sw\viewer>python
Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> x = '\xe8\x9f\x92\xe8\x9b\x87'
>>> x.decode('utf8')
u'\u87d2\u86c7'
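The same decode step can be applied to each row the csv module hands back.
What follows is only a minimal sketch, assuming the file really is UTF-8;
decode_row is a hypothetical helper name, not something in the csv library,
and the sample row is made up from the bytes in your example.

```python
def decode_row(row, encoding='utf-8'):
    """Decode every 8-bit string in a row; leave any other cell untouched.

    Hypothetical helper: the 'utf-8' default is an assumption about the file.
    """
    return [cell.decode(encoding) if isinstance(cell, bytes) else cell
            for cell in row]


# Stand-in for one row as Python 2's csv.reader would return it:
# a list of 8-bit strings, not Unicode strings.
raw_row = [b'\xe8\x9f\x92\xe8\x9b\x87', b'abc']

decoded = decode_row(raw_row)
print(repr(decoded))
```

If the bytes turn out not to be valid UTF-8, decode() raises
UnicodeDecodeError, which is a useful sign that the guess about the
encoding was wrong.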
--
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.