Split on multiple delimiters, and also treat consecutive delimiters as a single delimiter?
Victor Hooi
victorhooi at gmail.com
Tue Jul 28 09:55:08 EDT 2015
I have a line that looks like this:
14 *0 330 *0 760 411|0 0 770g 1544g 117g 1414 computedshopcartdb:103.5% 0 30|0 0|1 19m 97m 1538 ComputedCartRS PRI 09:40:26
I'd like to split this line on multiple separators - in this case, consecutive whitespace, as well as the pipe symbol (|).
If I run .split() on the line, it will split on consecutive whitespace:
In [17]: f.split()
Out[17]:
['14',
'*0',
'330',
'*0',
'760',
'411|0',
'0',
'770g',
'1544g',
'117g',
'1414',
'computedshopcartdb:103.5%',
'0',
'30|0',
'0|1',
'19m',
'97m',
'1538',
'ComputedCartRS',
'PRI',
'09:40:26']
If I try to run .split(' |'), however, I get:
In [18]: f.split(' |')
Out[18]: [' 14 *0 330 *0 760 411|0 0 770g 1544g 117g 1414 computedshopcartdb:103.5% 0 30|0 0|1 19m 97m 1538 ComputedCartRS PRI 09:40:26']
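For reference, when str.split is given an argument it treats the whole argument as one literal separator, not as a set of characters, which would explain the single-element result above. A minimal sketch, using a shortened, made-up sample line rather than the full one:

```python
# Shortened sample line (illustrative only, not the full line from above).
sample = "14  *0   411|0  30|0 PRI"

# ' |' is looked for as the exact two-character sequence "space, pipe".
# That sequence never occurs in the sample, so nothing is split.
unsplit = sample.split(' |')

# A split only happens where that exact sequence appears.
literal = "a |b |c".split(' |')

print(unsplit)
print(literal)
```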
I know the re module also has a split; unfortunately, it does not collapse consecutive whitespace:
In [19]: re.split(' |', f)
Out[19]:
['',
'',
'',
'',
'14',
'',
'',
'',
'',
'*0',
'',
'',
'',
'330',
'',
'',
'',
'',
'*0',
'',
'',
'',
'',
'760',
'',
'',
'411|0',
'',
'',
'',
'',
'',
'',
'0',
'',
'',
'770g',
'',
'1544g',
'',
'',
'117g',
'',
'',
'1414',
'computedshopcartdb:103.5%',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'0',
'',
'',
'',
'',
'',
'30|0',
'',
'',
'',
'',
'0|1',
'',
'',
'',
'19m',
'',
'',
'',
'97m',
'',
'1538',
'ComputedCartRS',
'',
'PRI',
'',
'',
'09:40:26']
Is there an easy way to split on multiple characters, and also treat consecutive delimiters as a single delimiter?
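One possible approach, sketched here under the assumption that every run of whitespace and/or pipes should count as a single delimiter: put the delimiters in a regex character class followed by +. Outside a character class, | means alternation, so the pattern ' |' reads as "a space or nothing"; inside [...] the pipe is a literal character. The sample line below is a shortened stand-in for the real one.

```python
import re

# Shortened stand-in for the real line, with uneven spacing and pipes.
line = "  14  *0   330  411|0  0  30|0   0|1  19m PRI 09:40:26  "

# [\s|]+ matches one or more whitespace or pipe characters as ONE delimiter.
# strip() first, so leading/trailing whitespace does not yield empty strings.
fields = re.split(r'[\s|]+', line.strip())

# Alternative: collect the runs of non-delimiter characters directly,
# which needs no strip() and never produces empty strings.
fields_alt = re.findall(r'[^\s|]+', line)

print(fields)
```

Both forms give the same list for this sample. The findall variant may be the safer choice if a line could start or end with a pipe, since strip() with no argument only removes whitespace.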