Unicode equality from raw_input
kmtracey at gmail.com
Sun Oct 12 04:50:42 CEST 2008
2008/10/11 Damian Johnson <atagar1 at gmail.com>
> Hi, when getting text via the raw_input method it's always a string (even
> if it contains non-ASCII characters). The problem lies in that whenever I
> try to check equality against a Unicode string it fails. I've tried using
> the unicode method to 'cast' the string to the Unicode type but this throws
> an exception:
Python needs to know the encoding of the byte string in order to convert it
to unicode. If you don't specify an encoding, the default (ASCII) is assumed,
which fails for any byte string that actually contains non-ASCII data.
Since you are reading the string from standard input, try using the encoding
associated with stdin:
>>> import sys
>>> a = raw_input("text: ")
>>> b = u"おはよう"
>>> # decode the raw bytes with the terminal's encoding before comparing
>>> unicode(a, sys.stdin.encoding) == b
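
One caveat with the above: sys.stdin.encoding can be None in Python 2 when
input is piped rather than typed at a terminal. Below is a minimal sketch of a
helper that handles this. The name decode_input and the UTF-8 fallback are my
own assumptions, not anything from the standard library, and the byte string
is built by hand to stand in for what raw_input would return on a UTF-8
terminal.

```python
# -*- coding: utf-8 -*-
import sys


def decode_input(raw, encoding=None):
    """Decode a byte string read from stdin into a unicode string.

    decode_input is a hypothetical helper. The fallback to UTF-8 when no
    encoding can be determined is an assumption; adjust it for your setup.
    """
    if not isinstance(raw, bytes):
        return raw  # already text, nothing to decode
    enc = encoding or getattr(sys.stdin, "encoding", None) or "utf-8"
    return raw.decode(enc)


expected = u"おはよう"
# Stand-in for the bytes raw_input would hand back on a UTF-8 terminal.
raw = expected.encode("utf-8")

# Decoding as ASCII fails because the bytes are outside the ASCII range.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the right encoding makes the comparison succeed.
matches = decode_input(raw, "utf-8") == expected
```

If the encoding is guessed wrong, the decode may either raise
UnicodeDecodeError or silently produce different text, so the comparison is
only as reliable as the encoding you pass in.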