[BangPypers] regular expression for Indian landline numbers

Anand Balachandran Pillai abpillai at gmail.com
Thu Nov 25 12:58:02 CET 2010

On Thu, Nov 25, 2010 at 3:11 PM, Kenneth Gonsalves <lawgon at au-kbc.org> wrote:

> hi,
> on looking at the telephone book, Indian landline numbers have three
> forms
> 3 digit STD code followed by 8 digits
> 4 digit STD code followed by 7 digits
> 5 digit STD code followed by 6 digits
> the first digit of the STD code has to be 0. The first digit of the
> landline number starts from 1-6. Of course I am not dead sure of the
> starting numbers, but I have seen mobile numbers starting with 9 and 8,
> and I think 7 is also reserved for mobile. I could not find any
> authoritative info on this. This is the re:
> r'(^0\d{2}[-\s]{1}[1-6]{1}\d{7})|(^0\d{3}[-\s]{1}[1-6]{1}\d{6})|(^0\d{4}[-\s]{1}[1-6]{1}\d{5})'
> any clues on how to make it shorter? And any info as to whether my
> assumptions as to the landline numbers are correct?

Your regex is complicated because you are putting all the rules
into a single regex. There are different ways to make this shorter.

In my opinion, the best option is to define two regexes, one
for the STD code part and the other for the number part. In
this case, it would look like this:

>>> import re
>>> std=re.compile(r'(^0\d{2,4})')
>>> num=re.compile(r'([1-6]{1}\d{5,7})')
>>> number='080-25936609'

Find out the lengths of the std and number parts.

>>> l1=len(std.findall(number.split('-')[0])[0])
>>> l1
3
>>> l2=len(num.findall(number.split('-')[1])[0])
>>> l2
8

And do the rest in code.

if (((l1==3) and (l2==8)) or ... ):
    print('valid number')
else:
    print('invalid number')
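
Spelled out in full, the first option could look like the sketch below.
It assumes Python 3 (for re.fullmatch) and that the STD code and the
number are separated by exactly one hyphen or whitespace character. The
names STD_RE, NUM_RE, VALID_LENGTHS and is_valid_landline are made up for
this illustration; the length pairs are the three forms listed in the
quoted message.

```python
import re

# Hypothetical names; the patterns follow the rules from the quoted message.
STD_RE = re.compile(r'0\d{2,4}')      # STD code: leading 0, 3 to 5 digits in total
NUM_RE = re.compile(r'[1-6]\d{5,7}')  # number: 6 to 8 digits, first digit 1-6

# Allowed (STD code length, number length) pairs.
VALID_LENGTHS = {(3, 8), (4, 7), (5, 6)}


def is_valid_landline(number):
    """Return True if number has the form '<STD code><sep><number>'."""
    parts = re.split(r'[-\s]', number)
    if len(parts) != 2:
        return False
    std, num = parts
    # fullmatch requires the whole part to match, so stray extra
    # characters are not silently skipped.
    if not STD_RE.fullmatch(std) or not NUM_RE.fullmatch(num):
        return False
    return (len(std), len(num)) in VALID_LENGTHS


if __name__ == '__main__':
    for candidate in ('080-25936609', '0471 2345678', '080-2593660'):
        print(candidate, is_valid_landline(candidate))
```

Keeping the length pairs in a set means a new form, if one turns up,
is a one-line change rather than another regex alternative.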

The second option, which I don't favor, is to split the rule into 3 regexes
instead of ORing them together, and then do a simple OR in code.

>>> r1=re.compile(r'(^0\d{2}[-\s]{1}[1-6]{1}\d{7})')
>>> r2=re.compile(r'(^0\d{3}[-\s]{1}[1-6]{1}\d{6})')
>>> r3=re.compile(r'(^0\d{4}[-\s]{1}[1-6]{1}\d{5})')

Then of course,

if (r1.match(number) or r2.match(number) or r3.match(number)):
    print('valid')
else:
    print('invalid')
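
As a self-contained sketch of this second option: the three patterns
can be kept in a list and tested with any(). The trailing $ and the
dropped {1} quantifiers are my own changes to the patterns, and the
names PATTERNS and matches_any are only for illustration.

```python
import re

# The three alternatives of the original expression, one pattern each.
# Assumption: a trailing $ is added so that extra digits are rejected.
PATTERNS = [
    re.compile(r'^0\d{2}[-\s][1-6]\d{7}$'),  # 3-digit STD code + 8 digits
    re.compile(r'^0\d{3}[-\s][1-6]\d{6}$'),  # 4-digit STD code + 7 digits
    re.compile(r'^0\d{4}[-\s][1-6]\d{5}$'),  # 5-digit STD code + 6 digits
]


def matches_any(number):
    """Return True if at least one of the three patterns matches."""
    return any(p.match(number) for p in PATTERNS)


if __name__ == '__main__':
    print('valid' if matches_any('080-25936609') else 'invalid')
```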

If you can't use *any* code and absolutely have to do this directly
in a regex, go back to your original one. As far as I know, there is
no substantially shorter way to do this with plain alternation in the
re module.
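
One possible single-regex variant, offered only as a sketch: all three
forms come to 12 characters once the separator is counted (3+1+8,
4+1+7, 5+1+6), so a lookahead on the total length can stand in for the
three alternatives. This rests on the rules exactly as quoted above and
on exactly one separator character; the names LANDLINE_RE and
is_landline are hypothetical.

```python
import re

# (?=.{12}$) checks the total length first; the rest checks the shape.
# With a 12-character total, only the pairs 3+8, 4+7 and 5+6 can match.
LANDLINE_RE = re.compile(r'^(?=.{12}$)0\d{2,4}[-\s][1-6]\d{5,7}$')


def is_landline(number):
    """Return True if number matches the length-constrained pattern."""
    return LANDLINE_RE.match(number) is not None


if __name__ == '__main__':
    for candidate in ('080-25936609', '0471-25936609'):
        print(candidate, is_landline(candidate))
```

Whether this is easier to read than three explicit alternatives is a
matter of taste; the length trick stops working as soon as a form with
a different total length has to be supported.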


> --
> regards
> Kenneth Gonsalves

