Python-list Digest, Vol 61, Issue 443
Marc 'BlackJack' Rintsch
bj_666 at gmx.net
Thu Oct 30 06:58:30 EDT 2008
On Thu, 30 Oct 2008 13:50:47 +0300, Seid Mohammed wrote:
> ok
> but I am still not clear about my problem. If I test this one
> ==============
> >>> kk = 'how old are you'
> >>> len(kk)
> 15
> ==========
> but in my case
> ==========
> >>> abebe = 'አበበ በሶ በላ'
> >>> len(abebe)
> 23
> ==========
> why is the length 23 when I am expecting it to be only 9? I typed just 9
> characters (including spaces). There must be some kind of trick
> to it.
You typed 9 characters, but they are not encoded as 9 bytes, and `len` on
a byte string counts bytes. I guess your environment uses UTF-8 as its
encoding, because mine does too, and:
In [124]: abebe = 'አበበ በሶ በላ'
In [125]: len(abebe)
Out[125]: 23
In [126]: s = 'አ'
In [127]: len(s)
Out[127]: 3
In [128]: s
Out[128]: '\xe1\x8a\xa0'
So that one character is encoded in three bytes. If you really want to
operate on characters instead of bytes, use `unicode` objects:
In [129]: u = abebe.decode('utf-8')
In [130]: len(u)
Out[130]: 9
In [131]: print u[0]
አ
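To make the byte/character difference easier to see in one place, here is a small sketch that goes the other way round: it starts from a `unicode` literal and encodes it. It assumes the source file is saved as UTF-8 and that you run Python 2.6+ or 3.3+; the names `text`, `encoded` and `widths` are only illustrative.

```python
# -*- coding: utf-8 -*-
# Sketch only: assumes this file is saved as UTF-8.

text = u'አበበ በሶ በላ'            # unicode object: a sequence of characters
encoded = text.encode('utf-8')  # byte string: the same text as UTF-8 bytes

char_count = len(text)          # counts characters (code points)
byte_count = len(encoded)       # counts bytes

# Number of UTF-8 bytes each character needs on its own.
widths = [len(ch.encode('utf-8')) for ch in text]

print(char_count)   # 9
print(byte_count)   # 23
print(widths)       # [3, 3, 3, 1, 3, 3, 1, 3, 3]
```

The seven Ethiopic characters take three bytes each and the two spaces one byte each, which gives the 23 you saw.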
Ciao,
Marc 'BlackJack' Rintsch