regular expression problem
Karsten Hilbert
Karsten.Hilbert at gmx.net
Mon Oct 29 07:04:22 EDT 2018
On Sun, Oct 28, 2018 at 11:57:48PM +0100, Brian Oney wrote:
> On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote:
> > [^<:]
>
> Would a simple regex work?
This brought about the solution.
However, not this way:
> >>> import re
> >>> t = '$<name::options::range>$'
> >>> re.findall('[^<>:$]+', t)
> ['name', 'options', 'range']
because I am not trying to parcel out the placeholder *parts*
(but rather the placeholders from a given line).
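To illustrate the difference, here is a small sketch with the suggested
character class run against a whole line rather than a lone placeholder.
The sample line is invented for this example:

```python
import re

# Hypothetical template line (not from a real template).
line = 'Dear $<name::options::range>$, hello'

# The character class splits on every delimiter, so the surrounding
# text comes back mixed in with the placeholder parts.
pieces = re.findall('[^<>:$]+', line)
print(pieces)
```

The result no longer shows where a placeholder starts or ends, which is
what I need to know.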
I eventually figured that denoting the parsing stages
differently made for easier matching. Rather than
$<>$
$<<>>$
$<<<>>>$
do this
$1<>1$
$2<>2$
$3<>3$
which makes it way less ambiguous, and more matchable:
regexen = [
    r'\$1{0,1}<[^<].*?>1{0,1}\$',
    r'\$2<[^<].*?>2\$',
    r'\$3<[^<].*?>3\$'
]
The [^<] part ("the single < is NOT to be followed directly
by another <") is actually superfluous but does protect
against legacy document templates still having
$<<(<)...(>)>>$ in them.
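A minimal sketch of how the three patterns above might be applied to a
line. The sample lines are made up for illustration, and looping with
re.findall is an assumption about usage, not necessarily how the real
code does it:

```python
import re

# The three stage patterns as given above.
regexen = [
    r'\$1{0,1}<[^<].*?>1{0,1}\$',
    r'\$2<[^<].*?>2\$',
    r'\$3<[^<].*?>3\$',
]

# Hypothetical template line containing one placeholder per stage.
line = ('Dear $<title::options::range>$ '
        '$2<lastname::options::range>2$, '
        'seen $3<today::options::range>3$.')

# One list of whole placeholders per parsing stage.
found = [re.findall(rx, line) for rx in regexen]

# A legacy placeholder in the old nested-bracket notation: the [^<]
# after the first < keeps the stage 1 pattern from matching it.
legacy = '$<<name::options::range>>$'
legacy_found = [re.findall(rx, legacy) for rx in regexen]
```

Each pattern only picks up the placeholders of its own stage, and the
legacy form matches none of them.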
$<>$ is still retained as an alias for $1<>1$ because there are
A LOT of them in existing document templates. It is
normalized explicitly inside Python before fill-in values are
generated.
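The normalization step could look roughly like the following. This is
only a sketch: the names ALIAS and normalize_stage1 are made up here,
and the actual code may do it differently.

```python
import re

# Assumed pattern for the bare alias: "$<", one character that is not
# "<", then lazily up to the closing ">$".
ALIAS = re.compile(r'\$<([^<].*?)>\$')

def normalize_stage1(line):
    """Rewrite each $<...>$ alias as $1<...>1$, leaving other stages alone."""
    return ALIAS.sub(lambda m: '$1<' + m.group(1) + '>1$', line)

# Hypothetical sample line mixing the alias with explicit stages.
sample = 'a $<name::options::range>$ b $2<x>2$ c $1<y>1$'
normalized = normalize_stage1(sample)
print(normalized)
```

Running it a second time over already normalized text changes nothing,
since $1< no longer matches the alias pattern.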
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B