Another re question
Stephen Kloder
stephenk at cc.gatech.edu
Tue Oct 24 14:53:30 EDT 2000
Kent Polk wrote:
> On Mon, 23 Oct 2000 20:43:20 -0400, Stephen Kloder wrote:
>
> ---------
> >>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
> >>> re.findall(findpid_pat,sid2pid)
> ['1X4567', '1 4853', '1X0608']
> ---------
>
> Thanks. It works great in the cases I provided. Unfortunately, I
> forgot about one case - where the first name can be blank (spaces).
>
> >>> sid2pid="\n1 1979 1X4567\n00031 1 4853\n1S0959 1X0608\n 3S4267\n"
> >>> print sid2pid
>
> 1 1979 1X4567
> 00031 1 4853
> 1S0959 1X0608
> 3S4267
>
> >>> findpid_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
> >>> re.findall(findpid_pat,sid2pid)
> ['1X4567', '1 4853', '1X0608']
>
> which misses my (new) last case.
>
Assuming the first name is either entirely blank or starts with a non-space character (that is, it never begins with a space and then continues with non-space characters):
>>> sid2pid="\n1 1979 1X4567\n00031 1 4853\n1S0959 1X0608\n 3S4267\n"
>>> findpid_pat = r'\012+0*..\w*[\t, ]+([\w_ ]+)'
>>> re.findall(findpid_pat,sid2pid)
['1X4567', '1 4853', '1X0608', '3S4267']
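To make the difference between `\w.` and `..` concrete, here is a small side-by-side sketch of the two patterns. The column padding in the sample string is an assumption (the second field is taken to start at column 8); the field values are the ones quoted in this thread.

```python
import re

# Assumed layout: the first field is padded to 8 columns, so a blank
# first name shows up as a run of spaces at the start of the line.
sid2pid = (
    "\n"
    "1 1979  1X4567\n"
    "00031   1 4853\n"
    "1S0959  1X0608\n"
    "        3S4267\n"
)

# Earlier pattern: after any leading zeros, the first field must start
# with a word character, so an all-blank first field cannot match.
old_pat = r'\012+0*\w.\w*[\t, ]+([\w_ ]+)'
# Revised pattern: the two characters after any leading zeros may be
# anything except a newline, including spaces.
new_pat = r'\012+0*..\w*[\t, ]+([\w_ ]+)'

old_hits = re.findall(old_pat, sid2pid)
new_hits = re.findall(new_pat, sid2pid)

print(old_hits)  # ['1X4567', '1 4853', '1X0608']
print(new_hits)  # ['1X4567', '1 4853', '1X0608', '3S4267']
```

The revised pattern still relies on at least one space, tab, or comma separating the two fields, so it would not pick up a line that holds only a single unpadded value.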
Of course, if the names are always lined up in fixed columns, it may be simpler to use split() and
string slices ...
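A minimal sketch of the split()-and-slice alternative, assuming fixed-width records. The width of 8 columns for the first field, the padded sample string, and the names `FIRST_FIELD_WIDTH` and `second_fields` are all hypothetical; adjust the width to the real layout.

```python
# Hypothetical layout: the first field fills columns 0-7 and the
# second field starts at column 8.
FIRST_FIELD_WIDTH = 8

sid2pid = (
    "\n"
    "1 1979  1X4567\n"
    "00031   1 4853\n"
    "1S0959  1X0608\n"
    "        3S4267\n"
)

def second_fields(text, width=FIRST_FIELD_WIDTH):
    """Return the second column of every non-empty fixed-width line."""
    results = []
    for line in text.split("\n"):
        # Slice off the first column, then trim padding around the value.
        field = line[width:].strip()
        if field:  # skips empty lines and lines with no second field
            results.append(field)
    return results

pids = second_fields(sid2pid)
print(pids)  # ['1X4567', '1 4853', '1X0608', '3S4267']
```

Because the slice depends only on column position, a blank first name and a second field containing a space (such as `1 4853`) need no special handling.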
--
Stephen Kloder | "I say what it occurs to me to say.
stephenk at cc.gatech.edu | More I cannot say."
Phone 404-874-6584 | -- The Man in the Shack
ICQ #65153895 | be :- think.