Can one make 'in' ungreedy?
Chris Green
cl at isbd.net
Mon May 18 08:39:18 EDT 2020
Larry Martell <larry.martell at gmail.com> wrote:
> On Mon, May 18, 2020 at 7:05 AM Chris Green <cl at isbd.net> wrote:
> >
> > I have a strange/minor problem in a Python program I use for mail
> > filtering.
> >
> > One of the ways it classifies messages is by searching for a specific
> > string in square brackets [] in the Subject:, the section of code that
> > does this is:-
> >
> > #
> > #
> > # copy the fields from the filter configuration file into better named variables
> > #
> > nm = fld[0]                     # name/alias
> > dd = fld[1] + "/"               # destination directory
> > tocc = fld[2].lower()           # list address
> > sbstrip = '[' + fld[3] + ']'    # string to match in and/or strip out of subject
> > #
> > #
> > # see if the filter To/CC column matches the message To: or Cc: or if sbstrip is in Subject:
> > #
> > if (tocc in msgcc or tocc in msgto or sbstrip in msgsb):
> >     #
> >     #
> >     # set the destination directory
> >     #
> >     dest = mldir + dd + nm
> >     #
> >     #
> >     # Strip out list name (4th field) from subject if it's there
> >     #
> >     if sbstrip in msgsb:
> >         msg.replace_header("Subject", msgsb.replace(sbstrip, ''))
> >     #
> >     #
> >     # we've found a match so assume we won't get another
> >     #
> >     break
> >
> >
> > So in the particular case where I have a problem sbstrip is "[Ipswich
> > Recycle]" and the Subject: is "[SPAM] [Ipswich Recycle] OFFER:
> > Lawnmower (IP11)". The match isn't found, presumably because 'in' is
> > greedy and sees "[SPAM] [Ipswich Recycle]" which isn't a match for
> > "[Ipswich Recycle]".
> >
> > Other messages with "[Ipswich Recycle]" in the Subject: are being
> > found and filtered correctly, it seems that it's the presence of the
> > "[SPAM]" in the Subject: that's breaking things.
> >
> > Is this how 'in' should work, it seems a little strange if so, not
> > intuitively how one would expect 'in' to work. ... and is there any
> > way round the issue except by recoding a separate test for the
> > particular string search where this can happen?
>
> >>> sbstrip = "[Ipswich Recycle]"
> >>> subject = "[SPAM] [Ipswich Recycle] OFFER:Lawnmower (IP11)"
> >>> sbstrip in subject
> True
>
> Clearly something else is going on in your program. I would run it in
> the debugger and look at the values of the variables in the case when
> it fails when you think it should succeed. I think you will see the
> variables do not hold what you think they do.
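
For reference, 'in' on strings is a plain substring test with no greedy or
non-greedy behaviour, so a failed match usually means the two strings differ
in some way that is not visible when printed. A minimal sketch of how one
might check that, assuming (purely as a guess, nothing in this thread
confirms it) that the Subject: arrives folded with a line break inside the
bracketed tag. The folded value and the normalise() helper are hypothetical
and only illustrate the check:

```python
sbstrip = "[Ipswich Recycle]"

# Hypothetical header value: folded so the tag contains "\n " instead of " ".
msgsb = "[SPAM] [Ipswich\n Recycle] OFFER: Lawnmower (IP11)"

# repr() shows escapes such as \n that a plain print() would hide.
print(repr(sbstrip))
print(repr(msgsb))

# Plain substring test on the raw value.
raw_match = sbstrip in msgsb
print(raw_match)


def normalise(text):
    """Collapse every run of whitespace (including newlines) to one space."""
    return " ".join(text.split())


# Same test after whitespace has been normalised.
found = sbstrip in normalise(msgsb)
print(found)
```

If the repr() output of the real values shows no such difference, the cause
lies elsewhere and this normalisation would not help.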
Thanks for taking the trouble to look. It's a *bit* difficult to run
in the debugger as the program is a filter triggered by incoming
E-Mail messages. However I think I can fire stuff at it via stdin so
I'll see what I can fathom out doing that.
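
A rough sketch of what firing a message at the code via a stream could look
like, using the standard email module. The read_subject() helper and the
sample message are made up for illustration, and io.StringIO stands in for
sys.stdin here so the snippet runs on its own:

```python
import email
import io


def read_subject(stream):
    """Parse one message from a text stream and return its Subject value."""
    msg = email.message_from_file(stream)
    return msg["Subject"]


# Stand-in for sys.stdin; a real run would pass sys.stdin instead.
sample = io.StringIO(
    "From: someone@example.com\n"
    "Subject: [SPAM] [Ipswich Recycle] OFFER: Lawnmower (IP11)\n"
    "\n"
    "body text\n"
)

subject = read_subject(sample)

# Show the value with any hidden characters made visible.
print(repr(subject))
print("[Ipswich Recycle]" in subject)
```

Passing sys.stdin to read_subject() and redirecting a saved copy of a
failing message into the script should show the Subject as the filter
receives it.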
--
Chris Green