Matching Pure Numeric and '' with Python re

Steve Holden steve at holdenweb.com
Fri Oct 27 03:51:14 EDT 2006


Wijaya Edward wrote:
> Hi,
>  
> Given this list:
>  
> list = ['0123', '1A34' , '333-' , '' ]
>  
> I want to match only this element 
> '0123' (pure numeric) and '' (empty element).
>  
> Why this construct doesn't work?
>  
>          p = re.compile("''+|[0-9]+")
>            m = p.match(word)
>            if m:
>              print word,
>  
> Namely it doesn't print 0123 and ''.
> What's wrong with my regex?
>  
The first mistake is to assume that the single quotes are a part of the 
strings - they aren't. The second mistake is to over-complicate the 
pattern. All you actually need to match is zero or more digits - but you 
need a "$" at the end of the pattern to make sure the whole target string 
has been consumed in the match (otherwise *anything* will match, since 
all strings begin with zero or more digits). Here's my test:

  >>> import re
  >>> list = ['0123', '1A34' , '333-' , '' ]
  >>> p = re.compile(r"\d*$")
  >>> for word in list:
  ...   if p.match(word):
  ...     print "[%s]" % word
  ...
[0123]
[]
  >>>
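
To see what the "$" is buying you, here is a small sketch that runs the
same pattern with and without the anchor. It assumes Python 3 (print is a
function, and re.fullmatch needs 3.4 or later); the names words,
unanchored, anchored and full are mine, chosen just for this illustration.

```python
import re

words = ['0123', '1A34', '333-', '']

# Without the anchor, \d* is happy to match zero digits at position 0,
# so re.match succeeds for every string in the list.
unanchored = [w for w in words if re.match(r"\d*", w)]

# With "$" the match has to run to the end of the string, so only the
# all-digit string and the empty string survive.
anchored = [w for w in words if re.match(r"\d*$", w)]

# re.fullmatch requires the whole string to match, so no "$" is needed.
full = [w for w in words if re.fullmatch(r"\d*", w)]

print(unanchored)
print(anchored)
print(full)
```

One thing to watch: "$" also matches just before a trailing newline, so
"123\n" gets past r"\d*$" with re.match. If that matters for your data,
re.fullmatch (or "\Z" in place of "$") is the stricter choice.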

There is, however, an old saying to the effect that if you try to solve 
a problem with regular expressions you then have *two* problems. If this 
isn't just a learning exercise then consider:

  >>> for word in list:
  ...   if word.isdigit() or not word:
  ...     print "[%s]" % word
  ...
[0123]
[]
  >>>
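
If you want that test in more than one place, it wraps up as a one-line
function. This is only a sketch, assuming Python 3; the name
is_blank_or_digits is made up for the example.

```python
def is_blank_or_digits(word):
    """Return True for the empty string or a string made up only of digits."""
    # "not word" covers the empty string; isdigit() covers the all-digit case
    # (isdigit() on its own returns False for '').
    return not word or word.isdigit()


words = ['0123', '1A34', '333-', '']
selected = [w for w in words if is_blank_or_digits(w)]
print(selected)
```

Be aware that in Python 3 str.isdigit() also accepts some non-ASCII digit
characters (a superscript two, for instance), so if you need strictly
0-9 you would have to add a check for that.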

less-to-go-wrong-is-always-good-ly y'rs -  steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden



