Why is regexp not working?
python at mrabarnett.plus.com
Fri Jul 4 11:57:19 EDT 2014
On 2014-07-04 13:27, Florian Lindner wrote:
> Hello,
> I have that piece of code:
> def _split_block(self, block):
>     cre = [re.compile(r, flags = re.MULTILINE) for r in self.regexps]
>     block = "".join(block)
>     print(block)
>     print("-------------------")
>     for regexp in cre:
>         match = regexp.match(block)
>         for grp in regexp.groupindex:
>             data = match.group(grp) if match else None
>             self.data[grp].append(data)
> block is a list of strings, terminated by \n. self.regexps:
> self.regexps = [r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
>                 r"it (?P<it_read_ahead>\d+) read ahead"]
> If I run my program it looks like that:
> it 1 ahadf dt complete yes | write-iteration-checkpoint |
> Timestep completed
> -------------------
> it 1 read ahead
> it 2 ahgsaf dt complete yes | write-iteration-checkpoint |
> Timestep completed
> -------------------
> it 4 read ahead
> it 3 dfdsag dt complete yes | write-iteration-checkpoint |
> Timestep completed
> -------------------
> it 9 read ahead
> it 4 dsfdd dt complete yes | write-iteration-checkpoint |
> Timestep completed
> -------------------
> it 16 read ahead
> -------------------
> {'it_read_ahead': [None, '1', '4', '9', '16'], 'coupling_iterations': ['1',
> None, None, None, None]}
> it_read_ahead is always matched when it should be (all blocks but the first).
> But why is the regexp containing coupling_iterations only matched in the
> first block?
> I tried different combinations using re.match vs. re.search and with or
> without re.MULTILINE.
The character '|' is a metacharacter that separates alternatives. For
example, the regex 'a|b' will match 'a' or 'b'.
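Your first regex contains unescaped '|' characters and ends with one, so
its last alternative is empty and will match an empty string at the start
of the target string. That empty match succeeds whenever the other
alternatives fail there, which leaves the named group unset (None). To
match a literal '|', escape it as '\|'.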
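
A minimal sketch of the effect, assuming blocks shaped like the output
quoted above. The names first_block, later_block and the helper
coupling_iterations() are made up for illustration and are not part of
the original program:

```python
import re

# Two blocks modelled on the output quoted above (illustrative data).
first_block = (
    "it 1 ahadf dt complete yes | write-iteration-checkpoint |\n"
    "Timestep completed\n"
)
later_block = (
    "it 1 read ahead\n"
    "it 2 ahgsaf dt complete yes | write-iteration-checkpoint |\n"
    "Timestep completed\n"
)

# Original pattern: each unescaped '|' separates alternatives, and the
# trailing '|' leaves an empty last alternative that matches anywhere.
original = re.compile(
    r"it (?P<coupling_iterations>\d+) .* dt complete yes | write-iteration-checkpoint |",
    flags=re.MULTILINE,
)

# Escaped pattern: '\|' matches a literal '|' character.
escaped = re.compile(
    r"it (?P<coupling_iterations>\d+) .* dt complete yes \| write-iteration-checkpoint \|",
    flags=re.MULTILINE,
)


def coupling_iterations(pattern, block):
    """Return the named group from the first search hit, or None."""
    match = pattern.search(block)
    return match.group("coupling_iterations") if match else None


print(coupling_iterations(original, first_block))  # 1
print(coupling_iterations(original, later_block))  # None (empty match at position 0)
print(coupling_iterations(escaped, later_block))   # 2
```

The sketch uses search() rather than match() because, in the later
blocks, the wanted line is not at the start of the string. match() only
tries position 0, and re.MULTILINE does not change that; it only affects
what '^' and '$' match.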