How to Split Chinese Character with backslash representation?

Cameron Walsh cameron.walsh at gmail.com
Fri Oct 27 01:59:08 EDT 2006


limodou wrote:
> On 10/27/06, Wijaya Edward <ewijaya at i2r.a-star.edu.sg> wrote:
>>
>> Thanks, but my intention is to strictly use regex,
>> since there are separators I need to treat as delimiters,
>> especially in a case like this:
>>
>> >>> str = '\xc5\xeb\xc7\xd5\xbc--FOO--BAR'
>> >>> field = list(str)
>> >>> print field
>> ['\xc5', '\xeb', '\xc7', '\xd5', '\xbc', '-', '-', 'F', 'O', 'O', '-', 
>> '-', 'B', 'A', 'R']
>>
>> What we want as the output is this instead:
>> ['\xc5', '\xeb', '\xc7', '\xd5', '\xbc', 'FOO', 'BAR']
>>
>> What's the best way to do it?
>>
> If the case is very simple, why not just replace '-' with '', for example:
> 
> str.replace('-', '')
> 
Except he appears to want each Chinese character as its own element of 
the list, and each English word as a single element of the list.  Note 
carefully the last two elements in his desired list.  I'm still puzzling 
over this one...
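
One regex-only possibility, as a minimal sketch rather than a definitive
answer: match either a run of ASCII letters or a single non-ASCII byte, and
let the '-' separators fall through unmatched. It is written for Python 3
with a bytes literal (the original session is Python 2), and it assumes
that English words consist only of A-Z/a-z and that each high byte should
be its own element, as in the desired list above. The names `data`,
`pattern`, and `fields` are only illustrative.

```python
import re

# Byte string from the original post, written as a Python 3 bytes literal.
data = b'\xc5\xeb\xc7\xd5\xbc--FOO--BAR'

# Alternation: a run of ASCII letters (one English word), or exactly one
# byte in the range 0x80-0xff. The '-' separators match neither branch,
# so findall() simply skips over them.
pattern = re.compile(rb'[A-Za-z]+|[\x80-\xff]')

fields = pattern.findall(data)
print(fields)
```

Note that this splits on bytes, not on decoded characters. If the input is
really a multi-byte encoding, decoding it to text first and matching on
characters would likely be the safer route.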


