how to express the following regular expression?

Tim Hammerquist tim at vegeta.ath.cx
Wed Nov 21 05:36:09 EST 2001


Stephen <fungho at sinaman.com> graced us by uttering:
> I want to search the following formats:
> 
> ([1234567890]*)R

So... this will match all of the following:

'1234R', '123R', '12R', '1R', 'R'

Is this what you want?  (i.e., 0 or more digits followed by an 'R'?)
Or did you mean '+' instead of '*'?
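
A quick sketch of the difference between the two quantifiers. It assumes a
Python with re.fullmatch (3.4 or later), and the names star_form and
plus_form are made up for illustration only:

```python
import re

# Hypothetical names for illustration; [0-9] is equivalent to [1234567890].
star_form = re.compile(r'([0-9]*)R')   # zero or more digits, then 'R'
plus_form = re.compile(r'([0-9]+)R')   # one or more digits, then 'R'

samples = ['1234R', '1R', 'R']

# fullmatch succeeds only if the whole string fits the pattern.
star_hits = [s for s in samples if star_form.fullmatch(s)]
plus_hits = [s for s in samples if plus_form.fullmatch(s)]

print(star_hits)  # ['1234R', '1R', 'R']
print(plus_hits)  # ['1234R', '1R']
```

With '*' the bare 'R' is accepted; with '+' at least one digit is required.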

> or 
> 
> P([1234567890]*)

Again:

'P1234', 'P123', 'P12', 'P1', 'P'

> this means there must be a 'P' before the number or a 'R' after the
> number. However, I think I can't use this:
> P?([1234567890]*)R?
> because the number without P and R is also matched! 

Yes, this is also a bad choice, as it can match, well, nothing at all.
Since every part of the above regex is optional, it doesn't require that
*anything* be present in the string, so it tests true for all of the
following:

'1234', 'P1234', '1234R', 'P1234R', 'PR',
'P', 'R', '', 'arbitrary string'
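
This can be checked directly. A minimal sketch, assuming the pattern is
tested with re.match (the name loose is only for illustration):

```python
import re

# The all-optional pattern from the question.
loose = re.compile(r'P?([1234567890]*)R?')

samples = ['1234', 'P1234', '1234R', 'P1234R', 'PR',
           'P', 'R', '', 'arbitrary string']

# re.match only has to succeed at the start of the string, and every part
# of the pattern is optional, so an empty match is always available.
results = {s: loose.match(s) is not None for s in samples}
all_matched = all(results.values())
print(all_matched)  # True

# For 'arbitrary string' the text actually matched is the empty string.
empty = loose.match('arbitrary string').group(0)
print(repr(empty))  # ''
```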

> How can I express it?

You need to know exactly what you want the regex to match before you can
write a well-crafted regex.  Assuming you want one of the two following:

  - the character 'P', followed by 1 or more digits ([0-9])
or
  - 1 or more digits ([0-9]) followed by the character 'R'

...you can use the following:

    #!/usr/local/bin/python
    import re
    my_regex = re.compile(r'(P\d+|\d+R)')
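
A short usage sketch for the regex above. Whether you want fullmatch (whole
string) or search (anywhere in the string) depends on your data, so treat
the choice here as an assumption; re.fullmatch needs Python 3.4 or later:

```python
import re

my_regex = re.compile(r'(P\d+|\d+R)')

samples = ['P1234', '1234R', '1234', 'P', 'R', 'PR', '']

# fullmatch accepts a string only if the entire string fits one alternative.
accepted = [s for s in samples if my_regex.fullmatch(s)]
print(accepted)  # ['P1234', '1234R']

# search looks for the pattern anywhere inside a longer string.
found = my_regex.search('order P42 shipped')
print(found.group(1))  # P42
```

Note that search will also find 'P1234' inside 'P1234R'; if a string
carrying both markers should be rejected, test the whole string with
fullmatch instead.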

> thx!
> 
> Stephen

HTH,
Tim Hammerquist
-- 
A diplomat is a man who always remembers a woman's birthday but never her age.
    -- Robert Frost


