Help with regular expression in python

Matt Funk matze999 at gmail.com
Fri Aug 19 19:33:49 CEST 2011


On Friday, August 19, 2011, Alain Ketterlin wrote:
> Matt Funk <matze999 at gmail.com> writes:
> > thanks for the suggestion. I guess i had found another way around the
> > problem as well. But i really wanted to match the line exactly and i
> > wanted to know why it doesn't work. That is less for the purpose of
> > getting the thing to work but more because it greatly annoys me that
> > i can't figure out why it doesn't work. I.e. why the expression does
> > not match {32} times. I just don't get it.
> 
> Because a line is not 32 times a number, it is a number followed by 31
> times "a space followed by a number". Using Jason's regexp, you can
> build the regexp step by step:
> 
> number = r"\d\.\d+e\+\d+"
> numbersequence = r"%s( %s){31}" % (number,number)
That didn't work either. I modified the expression so that a trailing (.+) 
matches the end of the line:

number = r"\d\.\d+e\+\d+"
numbersequence = r"%s( %s){31}(.+)" % (number,number)
instance_linetype_pattern = re.compile(numbersequence)

The results obtained are:
results: 
[(' 2.199000e+01', ' : (instance: 0)\t:\tsome description')]
So this returns the last number plus the string at the end of the line, but 
does not retain the previous numbers.
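
A likely explanation, assuming the results above came from re.findall (the
call itself is not shown): when a pattern contains capturing groups, findall
returns only the groups, and a repeated group keeps just its last repetition.
So the pattern probably did match the whole line, and only the reporting
dropped the earlier numbers. A minimal sketch with a made-up, shortened line
of 3 numbers instead of 32 (so the repeat count is 2, not 31):

```python
import re

number = r"\d\.\d+e\+\d+"
# Made-up sample line: 3 numbers instead of 32.
line = "1.002000e+01 2.037000e+01 2.199000e+01 : (instance: 0)\t:\tsome description"

# Repeated capturing group: findall reports only its last repetition.
capturing = re.compile(r"%s( %s){2}(.+)" % (number, number))
print(capturing.findall(line))

# The whole line is matched anyway; group(0) is the full match.
m = capturing.match(line)
print(m.group(0) == line)

# Non-capturing repeat (?:...) with one group around the whole number sequence.
grouped = re.compile(r"(%s(?: %s){2})(.+)" % (number, number))
numbers_part, rest = grouped.match(line).groups()
print(numbers_part.split())
```

With the non-capturing form, the first group holds all the numbers as one
string, which can then be split on whitespace.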

Anyway, i think at this point i will go another route. I am not sure where 
the issue lies.

thanks for all the help
matt


> 
> There are better ways to build your regexp, but I think this one is
> convenient to answer your question. You still have to append what will
> match the end of the line.
> 
> -- Alain.
> 
> P/S: please do not top-post
> 
> >> $ python
> >> Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
> >> [GCC 4.4.3] on linux2
> >> Type "help", "copyright", "credits" or "license" for more information.
> >> 
> >> >>> data
> >> 
> >> '1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01
> >> 1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01
> >> 2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01
> >> 2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01
> >> 2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01
> >> 2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01
> >> 2.150000e+01 2.199000e+01 : (instance: 0)       :       some
> >> description'
> >> 
> >> >>> import re
> >> >>> re.findall(r"\d\.\d+e\+\d+", data)
> >> 
> >> ['1.002000e+01', '2.037000e+01', '2.128000e+01', '1.908000e+01',
> >> '1.871000e+01', '1.914000e+01', '2.007000e+01', '1.664000e+01',
> >> '2.204000e+01', '2.109000e+01', '2.209000e+01', '2.376000e+01',
> >> '2.158000e+01', '2.177000e+01', '2.152000e+01', '2.267000e+01',
> >> '1.084000e+01', '1.671000e+01', '1.888000e+01', '1.854000e+01',
> >> '2.064000e+01', '2.000000e+01', '2.200000e+01', '2.139000e+01',
> >> '2.137000e+01', '2.178000e+01', '2.179000e+01', '2.123000e+01',
> >> '2.201000e+01', '2.150000e+01', '2.150000e+01', '2.199000e+01']
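
The findall approach quoted above could be turned into a stricter per-line
check. This is only a sketch, assuming every data line has the shape shown
(numbers separated by whitespace, then " : " and a description). The helper
name parse_line and its expected parameter are hypothetical, chosen for
illustration.

```python
import re

# One number in the format used in the thread, anchored at the end of the token.
NUMBER = re.compile(r"\d\.\d+e\+\d+\Z")

def parse_line(line, expected=32):
    """Return (values, rest), or None if the line lacks `expected` leading numbers."""
    # Assumption: the first " : " separates the numbers from the description.
    head, sep, rest = line.partition(" : ")
    tokens = head.split()
    if len(tokens) != expected:
        return None
    if not all(NUMBER.match(t) for t in tokens):
        return None
    return [float(t) for t in tokens], sep + rest

# Made-up short line with 2 numbers instead of 32.
sample = "1.002000e+01 2.037000e+01 : (instance: 0)\t:\tsome description"
print(parse_line(sample, expected=2))
print(parse_line(sample, expected=3))
```

Checking the count explicitly avoids relying on how a repeated group is
reported, at the cost of not using a single regular expression for the whole
line.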



