Iterating over unicode strings
Martin v. Loewis
martin at v.loewis.de
Mon Mar 11 00:32:19 EST 2002
Arun Sharma <arun-public at sharma-home.net> writes:
> line = u"ಡಾ|| ಶಿವರಾಮ ಲ¾ÂàÒ°à²àÒ¤"
> for c in line:
>     print c
>
> fails miserably. What is the right way to do it ? I would also like to
> be able to slice the string i.e. line[i] to get the i'th character.
I'm not sure what you expect to happen, but I believe your program
works "correctly": it prints one character at a time.
Now, the question is: what did you want to happen? Apparently, you
want to use UTF-8 in your string literal. This is currently not
directly supported: the bytes of a Unicode literal are interpreted as
Latin-1, so each byte of a UTF-8 sequence becomes a separate
character. Instead, decode the byte string explicitly:
    line = unicode("your text", "utf-8")
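To make the difference concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, where bytes.decode() plays the role that unicode(..., "utf-8") plays above; the variable names and the two-code-point sample text are purely illustrative.

```python
# Sketch only: written for Python 3, where bytes.decode() stands in for
# the unicode(byte_string, "utf-8") call discussed in this message.

# Two Kannada code points (U+0CA1, U+0CBE), given as escapes so the
# example does not depend on the encoding of the source file.
text = "\u0ca1\u0cbe"

# UTF-8 needs three bytes for each of these code points.
raw = text.encode("utf-8")

# Interpreting the UTF-8 bytes as Latin-1 yields one character per byte.
as_latin1 = raw.decode("latin-1")

# Decoding them as UTF-8 recovers the intended code points.
as_utf8 = raw.decode("utf-8")

print(len(raw), len(as_latin1), len(as_utf8))

# Iteration and indexing now work per code point.
for c in as_utf8:
    print(repr(c))
print(as_utf8[0] == "\u0ca1")
```

Note that iteration and indexing work on code points, so a consonant and its vowel sign still come out as two separate elements even though they are displayed as one glyph.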
HTH,
Martin