Help with regular expression in python
Matt Funk
matze999 at gmail.com
Thu Aug 18 19:03:05 EDT 2011
Hi guys,
thanks for the suggestions. I had already tried matching the whitespace as
well (to no avail). Here is the expression I am using now (based on the
suggestions), but still no success:
instance_linetype_pattern_str =\
r'(([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+))?\s+){32}(.+)'
instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
results = instance_linetype_pattern.findall(line)
print "results: "; print results
The match I get is:
results:
[('2.199000e+01 ', '2.199000', '.199000', 'e+01', ': (instance: 0)\t:\tsome description')]
btw: The line to be matched (given below) is ONE line. There are no line
breaks (even though my email client adds them).
matt
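A note on the output above: it looks consistent with how Python's re module
treats repeated groups. findall returns one tuple per match with one entry per
capturing group, and a capturing group inside a {32} repetition keeps only its
last repetition, which would explain why only 2.199000e+01 shows up. The sketch
below is one possible workaround, not the only one: repeat with non-capturing
groups, capture the whole numeric run once, and split it afterwards. The sample
line, FLOAT, line_pattern and repeated are made-up names for illustration, and
the sample values are not the ones from the original data.

```python
import re

# Hypothetical sample line: 32 floats, then the trailing description.
floats = ["%.6e" % (10.0 + i) for i in range(32)]
line = " ".join(floats) + " : (instance: 0)\t:\tsome description"

# One float in plain or scientific notation, using non-capturing groups only.
FLOAT = r'[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?'

# A capturing group inside a repetition keeps only the LAST repetition.
repeated = re.compile(r'(?:(' + FLOAT + r')\s+){32}')
last_only = repeated.match(line).group(1)
print(last_only)

# Workaround: capture the whole run of floats as one group, then split it.
# Using + instead of {32} allows the number of floats to vary.
line_pattern = re.compile(r'\s*((?:' + FLOAT + r'\s+)+):\s*(.*)')

match = line_pattern.match(line)
if match:
    values = [float(tok) for tok in match.group(1).split()]
    description = match.group(2)
    print(len(values))
    print(repr(description))
```

The code is written to run under both Python 2 and Python 3 (print is called
with a single argument only).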
On Thursday, August 18, 2011, Vlastimil Brom wrote:
> 2011/8/18 Matt Funk <matze999 at gmail.com>:
> > Hi,
> > I am sorry if this doesn't quite match the subject of the list. If
> > someone takes offense please point me to where this question should go.
> > Anyway, I have a problem using regular expressions. I would like to
> > match the line:
> >
> > 1.002000e+01 2.037000e+01 2.128000e+01 1.908000e+01 1.871000e+01
> > 1.914000e+01 2.007000e+01 1.664000e+01 2.204000e+01 2.109000e+01
> > 2.209000e+01 2.376000e+01 2.158000e+01 2.177000e+01 2.152000e+01
> > 2.267000e+01 1.084000e+01 1.671000e+01 1.888000e+01 1.854000e+01
> > 2.064000e+01 2.000000e+01 2.200000e+01 2.139000e+01 2.137000e+01
> > 2.178000e+01 2.179000e+01 2.123000e+01 2.201000e+01 2.150000e+01
> > 2.150000e+01 2.199000e+01 : (instance: 0) : some description
> >
> > The number of floats can vary (in this example there are 32). So what I
> > thought I'd do is the following:
> > instance_linetype_pattern_str =
> > '([-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?) {32}'
> > instance_linetype_pattern = re.compile(instance_linetype_pattern_str)
> > Basically, the expression in the first major set of parentheses matches a
> > number in scientific notation. The '{32}' is supposed to repeat that
> > group 32 times. However, it doesn't. I can't figure out why this does
> > not work. I'd really like to understand it if someone can shed light on it.
> >
> > thanks
> > matt
> > --
> > http://mail.python.org/mailman/listinfo/python-list
>
> Hi,
> the already suggested handling of whitespace with \s+ etc. at the end
> of the parenthesised pattern should help;
> furthermore, if you are using this pattern in the Python source, you
> should either double all backslashes or use a raw string for the
> pattern - with r prepended before the opening quotation mark:
> pattern_str = r"..."
> in order to handle backslashes literally and not as escape characters.
> hth,
> vbr