string processing question

Scott David Daniels Scott.Daniels at Acm.Org
Thu Apr 30 20:24:51 CEST 2009

Kurt Mueller wrote:
> on a Linux system and python 2.5.1 I have the
> following behaviour which I do not understand:
> case 1
>> python -c 'a="ä"; print a ; print a.center(6,"-") ; b=unicode(a, "utf8"); print b.center(6,"-")'
> ä
> --ä--
> --ä---
To discover what is happening, try something like:
     python -c 'for a in "ä", unicode("ä", "utf8"): print len(a), a'

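I suspect that in your encoding, "ä" is two bytes long, and in
unicode it is converted to a single character. Any padding computed
from len() then differs by one between the two strings, which would
account for the extra "-" in your second line of output.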
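
As a rough sketch of that effect (not your exact command): the snippet
below assumes the encoding is UTF-8 and that the padding came from
center() with a width of 6, which is only a guess from the output you
posted. The names text, raw, padded_raw, padded_text and shown are made
up for illustration. The character is written as an explicit code point
so that the result does not depend on the terminal or the source file
encoding.

```python
# Assumption: UTF-8 encoding and center(6, "-") padding, inferred from
# the output in the question rather than stated in it.
text = u"\u00e4"              # "ä" as a single unicode character
raw = text.encode("utf-8")    # the same character as a UTF-8 byte string

print("bytes: %d, characters: %d" % (len(raw), len(text)))

# center() pads according to len(), so the two results differ by one "-".
padded_raw = raw.center(6, b"-")     # two bytes plus four "-" bytes
padded_text = text.center(6, u"-")   # one character plus five "-"

# Both results have length 6, but once the byte string is decoded for
# display it shows only five characters.
shown = padded_raw.decode("utf-8")
print("padded lengths: %d and %d, displayed: %d and %d"
      % (len(padded_raw), len(padded_text), len(shown), len(padded_text)))
```

Decoding to unicode before padding, as in your b=unicode(a, "utf8")
step, is what makes the padding count characters instead of bytes.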

--Scott David Daniels
Scott.Daniels at Acm.Org