regular expression problem

Karsten Hilbert Karsten.Hilbert at gmx.net
Mon Oct 29 04:02:32 EDT 2018


On Sun, Oct 28, 2018 at 11:14:15PM +0000, MRAB wrote:

> > - lines can contain several placeholders
> > 
> > - placeholders start and end with '$'
> > 
> > - placeholders are parsed in three passes
> > 
> > - the pass in which a placeholder is parsed is denoted by the number of '<' and '>' next to the '$':
> > 
> > 	$<...>$ / $<<...>>$ / $<<<...>>>$
> > 
> > - placeholders for different parsing passes must be nestable:
> > 
> > 	$<<<...$<...>$...>>>$
> > 	....
> > 	(lower=earlier parsing passes will be inside)
> > 
> > - the internal structure is "name::options::range"
> > 
> > 	$<name::options::range>$
> > 
> > - name will *not* contain '$' '<' '>' ':'
> > 
> > - range can be either a length or a "from-until"
> > 
> > - a length will be a positive integer (no bounds checking)
> > 
> > - "from-until" is: a positive integer, a '-', and a positive integer (no sanity checking)
> > 
> > - options needs to be able to contain nearly anything, except '::'
> > 
> > 
> > Is that sufficiently defined and helpful to design the regular expression ?
> > 
> How can they be nested inside one another?
> Is the string scanned, placeholders filled in for that level, and then the
> string scanned again for the next level? (That would mean that the fill
> value itself will be scanned in the next pass.)

Exactly. But *different* levels can be nested inside each other.

> You could try matching the top level, for each match then match the next
> level, and for each of those matches then match for the final level.

So I do.
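
For illustration, the scan / fill / rescan loop described above could be sketched roughly as follows. This is not the actual code in question: the helper names placeholder_regex and fill_all_passes, and the resolve callback, are made up for this example, and the pattern assumes the "name::options::range" layout from the list above.

```python
import re

def placeholder_regex(level):
    """Build a pattern for one pass: level 1 is $<...>$, level 2 is $<<...>>$, etc."""
    return re.compile(
        re.escape('$' + '<' * level)
        # name may not contain '$', '<', '>' or ':', so a level-1 pattern
        # cannot start matching at the '$<<' of a level-2 placeholder
        + r'(?P<name>[^$<>:]+)::(?P<options>.*?)::(?P<range>\d+(?:-\d+)?)'
        + re.escape('>' * level + '$')
    )

def fill_all_passes(line, resolve):
    """Run the three passes in order; output of each pass is rescanned by the next.

    resolve(name, options, range_spec) is a hypothetical callback that
    returns the replacement text for one placeholder.
    """
    for level in (1, 2, 3):
        line = placeholder_regex(level).sub(
            lambda m: resolve(m.group('name'), m.group('options'), m.group('range')),
            line,
        )
    return line
```

Because each pass runs over the output of the previous one, whatever an inner placeholder expands to simply becomes part of the options of the outer placeholder by the time the outer pattern is applied.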

> Trying to do it all in one regex is usually a bad idea.

Right, I am not trying to do that. I was, however, worried
that the expression might "trip over" fragments that merely
look like part of another placeholder.

	$<<ph_1::option=$<ph_2::option=3::10>$::15>>$

Pass 1 might turn that into:

	$<<ph_1::option=3 '>s'::15>>$

and I wanted to make sure the second pass does not stop here:

	$<<ph_1::option=3 '>s'::15>>$
	                   ^

Logically it should not because

	>s'::15>>$

does not match

	::\d*>>$

but I am not sure how to tell it that :-)
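
One possible way to "tell it that", offered as a sketch rather than a definitive answer: match the options lazily and require the complete tail ("::", the range, then the literal closer) right after them. A stray '>' inside the options then cannot end the match, because the engine only accepts a stopping point where the whole tail fits. The name PASS_2 is made up for this example; note that '$' has to be escaped to be taken literally.

```python
import re

# Pass-2 placeholder: $<<name::options::range>>$
PASS_2 = re.compile(r"""
    \$<<                        # literal opener; '$' must be escaped
    (?P<name>[^$<>:]+)          # name: no '$', '<', '>' or ':'
    ::
    (?P<options>(?:(?!::).)*?)  # any characters, but never across a '::'
    ::
    (?P<range>\d+(?:-\d+)?)     # a length, or "from-until"
    >>\$                        # literal closer
""", re.VERBOSE)

line = "$<<ph_1::option=3 '>s'::15>>$"
match = PASS_2.search(line)

# A placeholder whose tail lacks the '::range' part is not matched at all.
broken = PASS_2.search("$<<ph_1::option=3 '>>$")
```

Here match.group('options') contains the '>' from the filled-in value and match.group('range') is the trailing 15, while broken is None.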

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B


