[Tutor] Splitting a unicode string into characters (was "(No Subject)")

Terry Carroll carroll at tjc.com
Thu Aug 25 18:52:10 CEST 2005


Jorge, please include a subject line.

On Thu, 25 Aug 2005, Jorge Louis de Castro wrote:

> What is the best way to split a unicode string in its characters? 
> Specifically, having this unicode chinese string
> 
> u'\u8C01\u4ECA\u5929\u7A7F\u4EC0\u4E48

I'm assuming you've actually got the close-quote there, i.e.:

>>> s=u'\u8C01\u4ECA\u5929\u7A7F\u4EC0\u4E48'

> I want to either split all its characters:
> [\u8C01,\u4ECA,\u5929,\u7A7F,\u4EC0,\u4E48]

>>> l=list(s)
>>> l
[u'\u8c01', u'\u4eca', u'\u5929', u'\u7a7f', u'\u4ec0', u'\u4e48']

> or insert a space between each character:
> \u8C01 \u4ECA \u5929 \u7A7F \u4EC0 \u4E48

>>> s_with_spaces = ' '.join(l)
>>> s_with_spaces
u'\u8c01 \u4eca \u5929 \u7a7f \u4ec0 \u4e48'
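
If you'd rather have this in a script than at the interactive prompt, the two steps above can be sketched as follows. This is only a minimal sketch: the names chars and spaced are mine, not anything from your code, and I'm assuming a Python that accepts u'' literals (2.x, or 3.3 and later). All six characters in your string are in the Basic Multilingual Plane, so each one comes out as a single element; characters outside it may be split differently on some Python 2 builds.

```python
# The string from the question, with the closing quote added.
s = u'\u8C01\u4ECA\u5929\u7A7F\u4EC0\u4E48'

# list() iterates over the string and yields one element per character.
chars = list(s)

# Join the characters back together with a single space between each.
# Using a unicode separator keeps the result unicode on Python 2.
spaced = u' '.join(chars)

print(len(chars))   # one element per character in s
print(repr(spaced))
```

Splitting spaced on u' ' gives back the same list, so the two forms are easy to convert between.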



