Unicode confusion
Jerry Hill
malaclypse2 at gmail.com
Mon Jul 14 12:51:01 EDT 2008
On Mon, Jul 14, 2008 at 12:40 PM, Tim Cook <timothywayne.cook at gmail.com> wrote:
> if I say units=unicode("°"). I get
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 0:
> ordinal not in range(128)
>
> If I try x=unicode.decode(x,'utf-8'). I get
> TypeError: descriptor 'decode' requires a 'unicode' object but received
> a 'str'
>
> What is the correct way to interpret these symbols that come to me as a
> string?
Part of it depends on where you're getting them from. If they are in
your source code, just define them as unicode literals (note the u
prefix), like this:
>>> units = u"°"
>>> print units
°
>>> print repr(units)
u'\xb0'
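As a minimal sketch of the source-code case, the snippet below shows three spellings of the same one-character unicode string. It assumes the file is saved as UTF-8, which is what the coding line declares; Python 2 needs that declaration for the non-ASCII literal, while Python 3 defaults to UTF-8. The variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# The coding line above must match how the file is actually saved
# (assumed to be UTF-8 here).

literal = u"°"                # the character typed directly in the source
escaped = u"\xb0"             # the same code point written as a hex escape
named = u"\N{DEGREE SIGN}"    # the same code point by its Unicode name

# All three are the single code point U+00B0.
assert literal == escaped == named
assert len(literal) == 1
```

The escaped forms are pure ASCII, so they keep working even if the file's encoding declaration is missing or wrong.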
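If they're coming from an external source, you have to know the
encoding they're being sent in. Then you can decode them into
unicode by calling decode() on the byte string itself, like this.
(This example assumes Latin-1 input; the 0xc2 byte in your traceback
suggests yours is UTF-8, so you would pass 'utf-8' instead.)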
>>> units = "°"
>>> unicode_units = units.decode('Latin-1')
>>> print repr(unicode_units)
u'\xb0'
>>> print unicode_units
°
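To illustrate why the encoding matters, here is a small sketch using explicit byte values instead of whatever the terminal happens to send. The two byte strings stand in for hypothetical external sources. decode() is called on the byte string itself, which is what avoids the TypeError from unicode.decode(x, 'utf-8'); in Python 2, unicode(x, 'utf-8') would be an equivalent spelling.

```python
# The degree sign as raw bytes from two hypothetical external sources.
utf8_bytes = b"\xc2\xb0"     # U+00B0 encoded as UTF-8 (two bytes)
latin1_bytes = b"\xb0"       # U+00B0 encoded as Latin-1 (one byte)

# Decode each byte string with the encoding it was actually written in.
from_utf8 = utf8_bytes.decode("utf-8")
from_latin1 = latin1_bytes.decode("latin-1")

# Both give the same one-character unicode string.
assert from_utf8 == from_latin1 == u"\xb0"

# Using the wrong codec does not raise here; it silently yields two
# characters (U+00C2 followed by U+00B0) instead of one.
wrong = utf8_bytes.decode("latin-1")
assert wrong == u"\xc2\xb0"
assert len(wrong) == 2
```

Latin-1 maps every byte to a character, so decoding with it never fails; a wrong guess shows up only as stray characters in the output.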
--
Jerry