Another re question

Stephen Kloder stephenk at cc.gatech.edu
Tue Oct 24 14:53:30 EDT 2000


Kent Polk wrote:

> On Mon, 23 Oct 2000 20:43:20 -0400, Stephen Kloder wrote:
>
> ---------
>  >>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
>  >>> re.findall(findpid_pat,sid2pid)
>  ['1X4567', '1 4853', '1X0608']
> ---------
>
> Thanks. It works great in the cases I provided.  Unfortunately, I
> forgot about one case - where the first name can be blank (spaces).
>
>  >>> sid2pid="\n1 1979 1X4567\n00031  1 4853\n1S0959 1X0608\n       3S4267\n"
>  >>> print sid2pid
>
>  1 1979 1X4567
>  00031  1 4853
>  1S0959 1X0608
>         3S4267
>
>  >>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
>  >>> re.findall(findpid_pat,sid2pid)
>  ['1X4567', '1 4853', '1X0608']
>
> which misses my (new) last case.
>

Assuming the first name cannot begin with a space and still contain nonspace
characters, replace the leading \w with . so that a blank first name also matches:
>>> sid2pid="\n1 1979 1X4567\n00031  1 4853\n1S0959 1X0608\n       3S4267\n"
>>> findpid_pat = r'\012+0*..\w*[\t, ]+([\w_ ]+)'
>>> re.findall(findpid_pat,sid2pid)
['1X4567', '1 4853', '1X0608', '3S4267']
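
To see what the change buys, here is a small sketch that runs both patterns on the same data and reports which entries only the relaxed one finds. The names old_pat, new_pat and missed are just for illustration; the data and patterns are the ones above.

```python
import re

sid2pid = "\n1 1979 1X4567\n00031  1 4853\n1S0959 1X0608\n       3S4267\n"

# Original pattern: after the optional leading zeros, the first name
# must start with a word character (\w), so an all-blank name fails.
old_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'

# Relaxed pattern: the first two characters may be anything except a
# newline, so a run of spaces is accepted as the first name.
new_pat = r'\012+0*..\w*[\t, ]+([\w_ ]+)'

old_hits = re.findall(old_pat, sid2pid)
new_hits = re.findall(new_pat, sid2pid)

# Entries found only by the relaxed pattern
missed = [pid for pid in new_hits if pid not in old_hits]
print(missed)
```

Note that the relaxed pattern still needs at least two characters before the separator, so it relies on the blank first name being padded with spaces.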

Of course, if the names are always lined up, it may be better to use split() and
string slices . . .
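
A rough sketch of that approach, assuming the second name always starts at column 7 as it does in the sample data (PID_START is a made-up name, and the offset would have to be checked against the real file):

```python
sid2pid = "\n1 1979 1X4567\n00031  1 4853\n1S0959 1X0608\n       3S4267\n"

# Assumption: fixed-width columns, with the second name starting at
# index 7 of every line. This holds for the sample data only.
PID_START = 7

# Split into lines, skip the empty ones, and slice off the second name.
pids = [line[PID_START:] for line in sid2pid.split("\n") if line.strip()]
print(pids)
```

This avoids the regular expression entirely, but it breaks as soon as a line is not aligned to the same column.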


--
Stephen Kloder               |   "I say what it occurs to me to say.
stephenk at cc.gatech.edu       |      More I cannot say."
Phone 404-874-6584           |   -- The Man in the Shack
ICQ #65153895                |            be :- think.




