Iterating over unicode strings

Martin v. Loewis martin at v.loewis.de
Mon Mar 11 00:32:19 EST 2002


Arun Sharma <arun-public at sharma-home.net> writes:

> line = u"ಡಾ|| ಶಿವರಾಮ ಕಾರಂತ"
> for c in line:
>      print c
> 
> fails miserably. What is the right way to do it? I would also like to
> be able to index the string, i.e. use line[i] to get the i'th character.

I'm not sure what you expect to happen, but I believe your program
works "correctly": it prints one character at a time.

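Now, the question is: what did you want to happen? Apparently, you
want to use UTF-8 in your string literal. This is currently not
directly supported: the bytes of a Unicode literal are interpreted as
Latin-1, so each byte of a UTF-8 sequence becomes a separate
character. Instead, decode the UTF-8 byte string explicitly: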

line = unicode("your text", "utf-8")
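To illustrate the difference, here is a minimal sketch (not part of the
original exchange). It assumes a later Python in which the b"" prefix
and bytes.decode() are available; raw.decode("utf-8") does the same job
as unicode(raw, "utf-8"). The variable names are made up for the
example, and the bytes are the UTF-8 encoding of only the first two
code points of the quoted string.

```python
# UTF-8 bytes for KANNADA LETTER DDA (U+0CA1) followed by
# KANNADA VOWEL SIGN AA (U+0CBE); names here are illustrative only.
raw = b"\xe0\xb2\xa1\xe0\xb2\xbe"

# Interpreting the bytes as Latin-1 yields one character per byte,
# which is what happens to UTF-8 text inside a Latin-1 literal.
as_latin1 = raw.decode("latin-1")

# Decoding as UTF-8 yields one character per encoded code point.
as_utf8 = raw.decode("utf-8")

latin1_len = len(as_latin1)
utf8_len = len(as_utf8)

# Iterating and indexing both work per code point after decoding.
code_points = [hex(ord(c)) for c in as_utf8]
first = as_utf8[0]
```

Note that even after correct decoding, iteration and line[i] work on
code points: the letter and its vowel sign above are two separate items,
although they are displayed as a single glyph.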

HTH,
Martin



