Another re question

Andrew Kuchling akuchlin at mems-exchange.org
Tue Oct 24 14:36:21 EDT 2000


kent at tiamat.goathill.org (Kent Polk) writes:
>  >>> findpid_pat = r'\012+\d |0*\w*[\t, ]+([\w ]+)'
> I don't understand how an empty string matches in this last
> case. Separately they are:
> 
>  >>> findpid_pat = r'\012+\d *\w*[\t, ]+([\w ]+)'
>  >>> findpid_pat = r'\012+0*\w*[\t, ]+([\w ]+)'

Oh no they're not; the two branches are '\012+\d ' and
'0*\w*[\t, ]+([\w ]+)'.  '|' has lower precedence than concatenation,
so everything before the | is in the first branch and everything
after it is in the second.
Try:
findpid_pat = r'\012+(?:\d |0*)\w*[\t, ]+([\w ]+)'
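A minimal sketch of the difference, assuming a made-up input line (the
sample string below is hypothetical and not from the original post).
With the original pattern the first branch matches on its own, so the
capturing group, which lives only in the second branch, never
participates; with the non-capturing group the alternation is confined
to the two intended prefixes:

```python
import re

# Original pattern: '|' splits the WHOLE pattern into two branches,
# '\012+\d ' and '0*\w*[\t, ]+([\w ]+)'.
original = r'\012+\d |0*\w*[\t, ]+([\w ]+)'

# Suggested fix: (?:...) limits the alternation to '\d ' versus '0*'.
grouped = r'\012+(?:\d |0*)\w*[\t, ]+([\w ]+)'

# Hypothetical sample: newline, a digit, a space, a tab, then a name.
sample = '\n5 \tinit'

m_orig = re.match(original, sample)
m_fixed = re.match(grouped, sample)

# The first branch matched, so group 1 did not take part in the match.
print(m_orig.group(1))   # None

# The whole pattern was used, so group 1 captured the trailing text.
print(m_fixed.group(1))  # init
```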

(How I caught this: by uncommenting the p.dump() call in
Lib/sre_parse.py.  It would be nice to have a way to request a dump of
the bytecode for an SRE pattern.)
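For readers on current Python versions: the re module accepts an
re.DEBUG flag that prints a dump of the parsed pattern when it is
compiled, so editing the library source is not needed. The sketch
below is not part of the original post, and the helper name
dump_pattern is hypothetical. The exact text of the dump varies
between Python versions, so no output is shown here:

```python
import contextlib
import io
import re


def dump_pattern(pattern):
    """Return the debug dump that re prints when compiling `pattern`.

    Hypothetical helper: re.DEBUG writes to standard output, so the
    output is captured into a string instead.
    """
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


# The dump of the original pattern shows a top-level branch with two
# alternatives, which makes the precedence of '|' visible.
print(dump_pattern(r'\012+\d |0*\w*[\t, ]+([\w ]+)'))
```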

--amk



