[Tutor] simple unicode question
Albert-Jan Roskam
fomcl at yahoo.com
Fri Aug 22 23:10:21 CEST 2014
Hi,
I have data that is either floats or UTF-8 encoded byte strings, and I need to convert both to unicode strings. I am probably missing something simple, but in the code below, under "float", why does [B] raise an error while [A] does not?
# Python 2.7.3 (default, Feb 27 2014, 19:39:10) [GCC 4.7.2] on linux2
>>> help(unicode)
Help on class unicode in module __builtin__:

class unicode(basestring)
 |  unicode(string [, encoding[, errors]]) -> object
 |
 |  Create a new Unicode object from the given encoded string.
 |  encoding defaults to the **current default string encoding**.
 |  errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
# ...
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
# float: cannot explicitly give encoding, even if it's the default
>>> value = 1.0
>>> unicode(value) # [A]
u'1.0'
>>> unicode(value, sys.getdefaultencoding()) # [B]
Traceback (most recent call last):
  File "<pyshell#22>", line 1, in <module>
    unicode(value, sys.getdefaultencoding())
TypeError: coercing to Unicode: need string or buffer, float found
>>> unicode(value, "utf-8")
# (... also TypeError)
# byte string: must explicitly give encoding (which makes perfect sense)
>>> value = '\xc3\xa9'
>>> unicode(value)
Traceback (most recent call last):
  File "<pyshell#31>", line 1, in <module>
    unicode(value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
>>> unicode(value, "utf-8")
u'\xe9'
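
For what it's worth, the conversion I am after could be sketched roughly as below. This is only a minimal sketch, not a definitive implementation: to_text is a hypothetical helper name, and it assumes that every byte string in the data really is UTF-8. It decodes byte strings with an explicit encoding and converts everything else (such as floats) without one, because passing an encoding makes unicode() try to decode its argument.

```python
# Hypothetical helper (not from any library): convert floats or
# UTF-8 byte strings to text. Assumes byte strings are valid UTF-8.
text_type = type(u"")  # unicode on Python 2, str on Python 3


def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string."""
    if isinstance(value, bytes):
        # Byte strings must be decoded with an explicit encoding.
        return value.decode(encoding)
    # Floats and other objects: convert without an encoding argument,
    # which is the same kind of call as [A].
    return text_type(value)


as_text_float = to_text(1.0)          # same conversion as [A]
as_text_bytes = to_text(b"\xc3\xa9")  # decoded as UTF-8
```

The type check decides whether an encoding is used at all, so the encoding is never passed along with a float as in [B].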
Thank you!
Regards,
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~