regular expression for integer and decimal numbers

Bengt Richter bokr at oz.net
Sun Sep 26 01:03:17 CEST 2004


On 25 Sep 2004 13:13:22 -0700, gary.wilson at gmail.com (gary) wrote:

>Peter Hansen <peter at engcorp.com> wrote in message news:<pbadnZrDHOinY87cRVn-jg at powergate.ca>...
>> gary wrote:
>> > I want to pick all integers and decimal numbers out of a string.
>> > Would this be the most correct regular expression to use?
>> > 
>> > "\d+\.?\d*"
>> 
>> Examples, including the most extreme cases you want to handle,
>> are always a good idea.
>> 
>> -Peter
>
>Here is an example of what I will be dealing with:
>"""
>TOTAL FIRST DOWNS                                     19        21
>   By Rushing                                         11         6
>   By Passing                                          6        10
>   By Penalty                                          2         5
>THIRD DOWN EFFICIENCY                           4-11-36%  6-14-43%
>FOURTH DOWN EFFICIENCY                            0-1-0%    0-0-0%
>TOTAL NET YARDS                                      379       271
>   Total Offensive Plays (inc. times thrown passing)  58        63
>   Average gain per offensive play                   6.5       4.3
>NET YARDS RUSHING                                    264       115
>"""
>
>I can only hope that they were nice and put a leading zero in front of
>numbers less than 1.

Are you sure you want to throw away all the info implicit in the structure of that data?
How about the columns? Will you get other input with more columns? Otherwise if your
numeric fields are as they appear, maybe just

 >>> def extract(s):
 ...     for a in s.split():
 ...         if not a[0].isdigit(): continue
 ...         if a.endswith('%'):
 ...             for i in map(int,a[:-1].split('-')): yield i
 ...         elif '.' in a: yield float(a)
 ...         else: yield int(a)
 ...
 >>> s = (
 ... """
 ... TOTAL FIRST DOWNS                                     19        21
 ...    By Rushing                                         11         6
 ...    By Passing                                          6        10
 ...    By Penalty                                          2         5
 ... THIRD DOWN EFFICIENCY                           4-11-36%  6-14-43%
 ... FOURTH DOWN EFFICIENCY                            0-1-0%    0-0-0%
 ... TOTAL NET YARDS                                      379       271
 ...    Total Offensive Plays (inc. times thrown passing)  58        63
 ...    Average gain per offensive play                   6.5       4.3
 ... NET YARDS RUSHING                                    264       115
 ... """
 ... )
 >>> for num in extract(s): print num,
 ...
 19 21 11 6 6 10 2 5 4 11 36 6 14 43 0 1 0 0 0 0 379 271 58 63 6.5 4.3 264 115
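
For comparison, here is a rough sketch of the same flat extraction using the pattern from the original question with re.findall. It is written for Python 3, and the names NUMBER, sample and extract_re are made up for illustration. The sample is a three-line excerpt of the data above.

```python
import re

# Pattern from the original question: digits, optional dot, optional digits.
NUMBER = re.compile(r"\d+\.?\d*")

sample = """\
THIRD DOWN EFFICIENCY                           4-11-36%  6-14-43%
   Average gain per offensive play                   6.5       4.3
NET YARDS RUSHING                                    264       115
"""

def extract_re(text):
    """Return every match as an int, or as a float when it contains a dot."""
    return [float(m) if "." in m else int(m) for m in NUMBER.findall(text)]

print(extract_re(sample))
# [4, 11, 36, 6, 14, 43, 6.5, 4.3, 264, 115]
```

Note that this pattern needs a leading digit, so ".5" would come out as the integer 5, and a number followed by a sentence-ending period ("5.") would swallow the period and come out as 5.0.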

But I doubt that's what you really want ;-)
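
If the columns do matter, something along these lines might be closer. This is only a sketch for Python 3, assuming every data row ends in exactly two numeric fields as in the sample; parse_field, parse_rows and the ncols parameter are hypothetical names, and a row whose last fields are not numeric would raise ValueError.

```python
def parse_field(field):
    """Convert one column: 'a-b-c%' -> tuple of ints, 'x.y' -> float, else int."""
    if field.endswith("%"):
        return tuple(int(part) for part in field[:-1].split("-"))
    return float(field) if "." in field else int(field)

def parse_rows(text, ncols=2):
    """Map each row label to its last `ncols` whitespace-separated fields."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) <= ncols:
            continue  # blank line, or a line with no label in front of the columns
        label = " ".join(parts[:-ncols])
        rows[label] = [parse_field(f) for f in parts[-ncols:]]
    return rows

sample = """\
TOTAL FIRST DOWNS                                     19        21
THIRD DOWN EFFICIENCY                           4-11-36%  6-14-43%
   Total Offensive Plays (inc. times thrown passing)  58        63
   Average gain per offensive play                   6.5       4.3
"""

stats = parse_rows(sample)
for label, values in stats.items():
    print(label, values)
```

That way each pair of numbers stays attached to the statistic it belongs to, e.g. stats["THIRD DOWN EFFICIENCY"] gives [(4, 11, 36), (6, 14, 43)].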

Regards,
Bengt Richter


