Help with regular expression in python

Matt Funk matze999 at gmail.com
Thu Aug 18 19:03:05 EDT 2011


Hi guys,

Thanks for the suggestions. I had tried the whitespace before as well (to no 
avail). Here is the expression I am using (based on the suggestions), but still 
no success:

instance_linetype_pattern_str =\
	r'(([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+))?\s+){32}(.+)'
instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
results = instance_linetype_pattern.findall(line)
print "results: "; print results


The match I get is:
results: 
[('2.199000e+01 ', '2.199000', '.199000', 'e+01', ': (instance: 0)\t:\tsome description')]
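
A note on the result above: the match does succeed, but a capturing group placed under a quantifier such as {32} keeps only the text of its last repetition, which is why only 2.199000e+01 shows up. One possible workaround, sketched here, is to capture the whole run of floats in a single group and split it afterwards. The names FLOAT, LINE and parse_line are hypothetical, the sample line is shortened to three floats, and the pattern is a simplified variant (non-capturing groups, + instead of {32}), not the pattern from the post.

```python
import re

# A quantified capturing group only remembers its final repetition.
last_only = re.match(r'(\d+\s+){3}', '1 2 3 ').group(1)
print(repr(last_only))

# One float in plain or scientific notation; (?:...) groups do not capture.
FLOAT = r'[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?'

# Group 1: the whole run of whitespace-separated floats.
# Group 2: whatever follows on the same line.
LINE = re.compile(r'((?:' + FLOAT + r'\s+)+)(.+)')


def parse_line(line):
    """Return (list_of_floats, rest_of_line), or None if the line does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    values = [float(tok) for tok in m.group(1).split()]
    return values, m.group(2)


# Shortened sample; the real line would carry all 32 values.
sample = "1.002000e+01 2.037000e+01 2.199000e+01 : (instance: 0)\t:\tsome description"
values, rest = parse_line(sample)
print(values)
print(repr(rest))
```

Because the number of floats can vary, the sketch uses + rather than a fixed count; len(values) can be checked afterwards if exactly 32 are required.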


Btw: the line to be matched (given below) is ONE line. There are no line 
breaks (even though my email client adds them).


matt


On Thursday, August 18, 2011, Vlastimil Brom wrote:
> 2011/8/18 Matt Funk <matze999 at gmail.com>:
> > Hi,
> > I am sorry if this doesn't quite match the subject of the list. If
> > someone takes offense please point me to where this question should go.
> > Anyway, I have a problem using regular expressions. I would like to
> > match the line:
> > 
> > 1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01
> > 1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01
> > 2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01
> > 2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01
> > 2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01
> > 2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01
> > 2.150000e+01 2.199000e+01 : (instance: 0)       :       some description
> > 
> > The number of floats can vary (in this example there are 32). So what I
> > thought I'd do is the following:
> > instance_linetype_pattern_str =
> > '([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?) {32}'
> > instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
> > Basically the expression in the first major set of parentheses matches a
> > scientific number format. The '{32}' is supposed to match the preceding
> > group 32 times. However, it doesn't. I can't figure out why this does not
> > work. I'd really like to understand it if someone can shed light on it.
> > 
> > thanks
> > matt
> > --
> > http://mail.python.org/mailman/listinfo/python-list
> 
> Hi,
> the already suggested handling of whitespace with \s+ etc. at the end
> of the parenthesised pattern should help;
> furthermore, if you are using this pattern in the Python source, you
> should either double all backslashes or use a raw string for the
> pattern - with r prepended before the opening quotation mark:
> pattern_str = r"..."
> in order to handle backslashes literally and not as escape characters.
> hth,
> vbr
