Unicode equality from raw_input
Karen Tracey
kmtracey at gmail.com
Sat Oct 11 22:50:42 EDT 2008
2008/10/11 Damian Johnson <atagar1 at gmail.com>
> Hi, when getting text via the raw_input method it's always a string (even
> if it contains non-ASCII characters). The problem lies in that whenever I
> try to check equality against a Unicode string it fails. I've tried using
> the unicode method to 'cast' the string to the Unicode type but this throws
> an exception:
>
Python needs to know the encoding of the byte string in order to convert it
to Unicode. If you don't specify an encoding, ASCII is assumed, which fails
for any byte string that actually contains non-ASCII data. Since you are
reading the string from standard input, try using the encoding associated
with stdin:
>>> a = raw_input("text: ")
text: おはよう
>>> b = u"おはよう"
>>> import sys
>>> unicode(a, sys.stdin.encoding) == b
True
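
The same idea can be sketched as a small script. This is only an
illustration under a few assumptions: decode_input is a hypothetical helper
name (not part of the standard library), the input bytes are simulated as
UTF-8 rather than read from a real terminal, and the fallback to
locale.getpreferredencoding() is one possible choice for the case where
sys.stdin.encoding is None (for example, when stdin is a pipe). Calling
raw.decode(enc) is equivalent to unicode(raw, enc).

```python
# -*- coding: utf-8 -*-
import locale
import sys


def decode_input(raw, encoding=None):
    """Decode a byte string read from stdin into a unicode string.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        # sys.stdin.encoding may be None (e.g. piped input), so fall back
        # to the locale's preferred encoding. This fallback is an assumption.
        encoding = getattr(sys.stdin, "encoding", None) or locale.getpreferredencoding()
    return raw.decode(encoding)


# Simulate the bytes raw_input would return on a UTF-8 terminal.
raw = u"おはよう".encode("utf-8")

# Decoding with the default ASCII assumption fails on non-ASCII bytes.
try:
    raw.decode("ascii")
    ascii_worked = True
except UnicodeDecodeError:
    ascii_worked = False

# Decoding with the right encoding makes the comparison succeed.
same = decode_input(raw, "utf-8") == u"おはよう"
```

With a real terminal you would pass the result of raw_input to decode_input
and leave the encoding argument out, so that the encoding reported for stdin
is used.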
Karen