[Tutor] re module / separator
Serdar Tumgoren
zstumgoren at gmail.com
Wed Jun 24 23:57:59 CEST 2009
As usual, Kent Johnson has swooped in and untangled the mess with a
clear explanation.
By the time a regex gets this complicated, I typically start thinking
of ways to simplify it or avoid regexes altogether.
Below is the code I came up with. It goes through some gymnastics and
can surely stand improvement, but it seems to get the job done.
Suggestions are welcome.
In [83]: text
Out[83]: 'a2345b. f325. a45453b. a325643b. a435643b. g234324b.'
In [84]: textlist = text.split()
In [85]: textlist
Out[85]: ['a2345b.', 'f325.', 'a45453b.', 'a325643b.', 'a435643b.', 'g234324b.']
In [86]: newlist = []
In [87]: pat = re.compile(r'a\w+b\.')
In [88]: for item in textlist:
   ....:     if pat.match(item):
   ....:         newlist.append(item)
   ....:     else:
   ....:         newlist.append("|")
   ....:
   ....:
In [89]: newlist
Out[89]: ['a2345b.', '|', 'a45453b.', 'a325643b.', 'a435643b.', '|']
In [90]: lastlist = ''.join(newlist)
In [91]: lastlist
Out[91]: 'a2345b.|a45453b.a325643b.a435643b.|'
In [92]: lastlist.rstrip("|").split("|")
Out[92]: ['a2345b.', 'a45453b.a325643b.a435643b.']
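For comparison, the same grouping could probably be done without the "|"
placeholder by using itertools.groupby. This is only a sketch, not part of
the session above: the function name group_runs is made up for illustration,
and it assumes the same pattern and the same whitespace-separated input.

```python
import re
from itertools import groupby

# Same pattern as in the session above.
pat = re.compile(r'a\w+b\.')

def group_runs(text):
    """Join each run of consecutive matching tokens into one string.

    Non-matching tokens act as separators and are dropped.
    (group_runs is a hypothetical helper name, not from the original post.)
    """
    tokens = text.split()
    return [
        ''.join(run)
        for is_match, run in groupby(tokens, key=lambda t: bool(pat.match(t)))
        if is_match
    ]

text = 'a2345b. f325. a45453b. a325643b. a435643b. g234324b.'
print(group_runs(text))
```

One difference in behavior: a non-matching token at the start of the text, or
two non-matching tokens in a row, leaves an empty string in the list with the
join/split approach, while this version simply skips them.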