[Tutor] Re: Question on Regular Expression Help

Magnus Lycka magnus@thinkware.se
Sat Feb 8 11:23:01 2003


Alfred Milgrom wrote:
 > def isnumber(str):
 >     digits = '-0123456789'
 >     for letter in str:
 >         if letter not in digits:
 >             return 0
 >     return 1

This will report 1-----4 as being an integer, right?
And it will reject +42, which is one. This can be fixed, of course.

 >>> def isnumber(str):
...     if str[:1] in ('+', '-'):
...             start = 1
...     else:
...             start = 0
...     if not str[start:]:
...             return False    # empty string or a lone sign
...     for c in str[start:]:
...             if c not in '0123456789':
...                     return False
...     return True

Don Arnold wrote:
> >>> def isInt(aString):
>  try:
>   int(aString)
>   return True
>  except:
>   return False

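But note that int(3.14) => 3 when you pass in a float! (A string like
"3.14" makes int() raise ValueError, so isInt returns False there.) Is
that what you want? Is 3.14 an integer?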

I think the code below works, but I don't really like it... I'm a bit
allergic to comparing floats for equality, even though I think it
should work in this particular case. (That is, if you think that
a float like 2.0000000000000000000000001 should be accepted as
an int since it's so close that the floating point mechanism can't
see a difference. The same thing written as a string is rejected,
because int() refuses strings with a decimal point.)

 >>> def isInt(s):
...     try:
...             return int(s) == float(s)
...     except (TypeError, ValueError, OverflowError):
...             return False
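
To make the difference between float and string arguments concrete,
here is a small check of both try/except ideas. This is only a sketch,
assuming a current Python 3; the names is_int_plain and is_int_strict
are made up for this illustration and are not from the earlier posts.

```python
def is_int_plain(value):
    # Same idea as the plain try/except version: accept whatever int() accepts.
    try:
        int(value)
        return True
    except (TypeError, ValueError, OverflowError):
        return False


def is_int_strict(value):
    # Same idea as the int() == float() comparison.
    try:
        return int(value) == float(value)
    except (TypeError, ValueError, OverflowError):
        return False


# Compare floats and strings side by side.
for value in (42, 3.14, "42", "3.14", "2.0", 2.0000000000000000000000001):
    print(repr(value), is_int_plain(value), is_int_strict(value))
```

The float 3.14 passes the plain version because int() truncates it,
while the string "3.14" fails both, since int() will not parse a
decimal point in a string.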

I think

import re
def isInteger(s):
     return re.match(r'[+-]?\d+$', s)

is a better choice. But of course, it depends on what you
mean by an integer. This code finds what is mathematically
an integer. It might not fit in a Python int. The same is
true for the first version.
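
One detail worth knowing with a pattern like r'[+-]?\d+$': the $ anchor
also matches just before a trailing newline, so "42\n" gets through.
If that matters, a variant along these lines might do. It is a sketch
assuming Python 3.4 or later (for fullmatch); is_integer_string is a
name I made up, and re.ASCII is there on the assumption that only the
digits 0-9 should count.

```python
import re

# Optional sign followed by one or more ASCII digits; fullmatch() requires
# the whole string to match, so no trailing anchor is needed.
_INT_RE = re.compile(r'[+-]?\d+', re.ASCII)


def is_integer_string(s):
    # Return a real bool instead of a match object or None.
    return _INT_RE.fullmatch(s) is not None


for candidate in ("42", "+42", "-7", "1-----4", "", "+", "3.14", "42\n"):
    print(repr(candidate), is_integer_string(candidate))
```

Returning a bool rather than the match object is just a convenience;
the match object works fine in an if statement too.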

Anyway, regular expressions are part of a programmer's toolbox.
Don't avoid them like the plague when they add value. Learn
them!

From a performance point of view, the try/except version is
the fastest when the string really is an integer, but the
loop version is faster at finding non-integer strings. The
regular expression version is the slowest, but I think regular
expressions might win big if applied to a bigger problem, like
finding all integers in a big string with something like

re.findall(r'\s([+-]?\d+)\s', s)
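
A caveat with the \s...\s form: each match consumes the whitespace on
both sides, so neighbouring integers that share a single space are
skipped, and so are integers at the very start or end of the string.
One way around that, as a sketch, is to use lookarounds instead, which
check for whitespace without consuming it. The function name
find_integers and the sample text are made up for this illustration.

```python
import re


def find_integers(text):
    # (?<!\S) : not preceded by a non-space character (so start of string is OK)
    # (?!\S)  : not followed by a non-space character (so end of string is OK)
    return re.findall(r'(?<!\S)[+-]?\d+(?!\S)', text)


sample = "1 2 3 -4 +5 x6 7y 8.5 42"

# Whitespace-consuming pattern: misses neighbours and the two ends.
print(re.findall(r'\s([+-]?\d+)\s', sample))   # ['2', '-4']

# Lookaround pattern: every whitespace-delimited integer is found.
print(find_integers(sample))   # ['1', '2', '3', '-4', '+5', '42']
```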


-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se