[Mailman-Developers] Topic regexps

Mark Sapiro msapiro at value.net
Sat Jun 3 05:18:51 CEST 2006


John W. Baxter wrote:

>On 5/25/06 8:29 PM, "Mark Sapiro" <msapiro at value.net> wrote:
>
>> I've thought about this some more and what I'm currently thinking is if
>> the topic regexp is multiline, leave it as is in topics, but before
>> compiling it for use, split the lines and then rejoin them with "|",
>> and compile not in VERBOSE mode.
>
>I think you need to strip any trailing | characters from the strings in the
>resulting list before joining with |.
>
>Otherwise, you might build one||two||three
>Which clearly isn't what is wanted.
>
>Hmmm...I guess I mean "up to one trailing | character"...in case someone
>really wanted one||two for some really strange reason.


I actually don't think that's necessary. My reasoning is that the
descriptions imply that multiple lines are joined with "|" without
having to explicitly put "|" in the pattern.

The changes I intend to put in Mailman 2.2 will include code in
versions.py to convert all presumably working multi-line patterns
(verbose mode) into a single line equivalent, non-verbose pattern.
For example, an existing topics regexp such as
'one|\ntwo|\nthree' will be converted during the 2.1.x to 2.2 upgrade
into 'one|two|three'. The new code in Tagger.py will therefore not see
multiple lines and won't change the pattern at all.

So the only future multi-line patterns will be new ones. Presumably
the person creating them will not be motivated to put a trailing "|" on
each line, and if he/she does anyway, testing should reveal that it is
wrong.
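
To make the trailing "|" case concrete, here is a minimal sketch of the
split-and-rejoin step. The helper name join_topic_lines is hypothetical,
and stripping whitespace and skipping blank lines are assumptions; the
actual Tagger.py change may differ.

```python
import re

def join_topic_lines(pattern):
    """Hypothetical helper: OR together the non-blank lines of a topic pattern."""
    lines = [line.strip() for line in pattern.splitlines()]
    return "|".join(line for line in lines if line)

clean = join_topic_lines("one\ntwo\nthree")
trailing = join_topic_lines("one|\ntwo|\nthree")

print(clean)      # one|two|three
print(trailing)   # one||two||three

# The empty alternative in 'one||two||three' matches the empty string,
# so a search succeeds against text that mentions none of the keywords.
print(re.search(clean, "unrelated subject") is None)         # True
print(re.search(trailing, "unrelated subject") is not None)  # True
```

A pattern that matches every message is the kind of mistake a quick test
by the list admin would expose.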

This leads me to the next question. I have written
Mailman.Utils.strip_verbose_pattern() to strip comments and whitespace
from a verbose pattern and return a single-line, non-verbose
equivalent.
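
For readers who want a feel for what such a conversion involves, the
following is a simplified sketch and not the code in Mailman.Utils. The
name strip_verbose_sketch is hypothetical. It assumes only the basic
VERBOSE rules: whitespace and "#" comments are dropped unless they are
backslash-escaped or inside a character class. It ignores corner cases
such as (?#...) comments.

```python
import re

def strip_verbose_sketch(pattern):
    """Illustrative only: drop re.VERBOSE whitespace and comments."""
    out = []
    in_class = False          # True while inside [...]
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            # Keep escaped pairs; spell an escaped newline as \n so the
            # result stays on one line.
            nxt = pattern[i + 1]
            out.append("\\n" if nxt == "\n" else c + nxt)
            i += 2
        elif in_class:
            # Everything in a class is literal; ']' closes it.
            out.append("\\n" if c == "\n" else c)
            in_class = c != "]"
            i += 1
        elif c == "[":
            in_class = True
            out.append(c)
            i += 1
            # A leading '^' and a ']' right after the opener do not close
            # the class.
            if i < n and pattern[i] == "^":
                out.append("^")
                i += 1
            if i < n and pattern[i] == "]":
                out.append("]")
                i += 1
        elif c == "#":
            # Comment: skip to end of line (the newline is dropped next).
            while i < n and pattern[i] != "\n":
                i += 1
        elif c.isspace():
            i += 1
        else:
            out.append(c)
            i += 1
    return "".join(out)

verbose = """
    one      # first keyword
  | two\\ 2   # 'two', an escaped space, then '2'
  | [a #]+z  # inside a class, space and '#' are literal
"""
flat = strip_verbose_sketch(verbose)
print(flat)  # one|two\ 2|[a #]+z

# Both forms should agree on what they match.
for text in ["one", "two 2", "a #z", "two2", "three"]:
    old = re.search(verbose, text, re.VERBOSE) is not None
    new = re.search(flat, text) is not None
    assert old == new
```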

I think the thing to do with existing topics is the following:


--- versions.py	(revision 7905)
+++ versions.py	(working copy)
@@ -307,6 +307,16 @@
             pass
         else:
             l.digest_members[k] = 0
+    #
+    # Convert pre 2.2 topics regexps which were compiled in verbose mode
+    # to a single line, non-verbose equivalent.
+    #
+    if stored_state.data_version <= 97 and hasattr(stored_state, 'topics')\
+            and stored_state.topics:
+        l.topics = []
+        for name, pattern, description, emptyflag in stored_state.topics:
+            pattern = Utils.strip_verbose_pattern(pattern)
+            l.topics.append((name, pattern, description, emptyflag))
 
 
 
I'm not 100% comfortable with this for two reasons. The first is that
this is apparently the only thing in versions.py which will reference
an actual data_version. It seems OK, but I wonder if there is a better
way.
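
One way to see what the data_version test buys is a small simulation.
This is only a sketch under stated assumptions: strip_stand_in is a
naive placeholder for Utils.strip_verbose_pattern that just deletes
whitespace, convert_topics and its cutoff parameter are hypothetical
names, and SimpleNamespace objects stand in for the list and its stored
state. Without some version guard, a pattern saved by 2.2 that contains
a literal space would be stripped on every load.

```python
import re
from types import SimpleNamespace

def strip_stand_in(pattern):
    # Naive placeholder for Utils.strip_verbose_pattern: drops whitespace only.
    return re.sub(r"\s+", "", pattern)

def convert_topics(l, stored_state, cutoff=97):
    # Same shape as the patch: convert only state saved at or below the cutoff.
    if (getattr(stored_state, "data_version", 0) <= cutoff
            and getattr(stored_state, "topics", None)):
        l.topics = [(name, strip_stand_in(pattern), description, emptyflag)
                    for name, pattern, description, emptyflag
                    in stored_state.topics]

# State saved by an old version: verbose, multi-line pattern.
old_state = SimpleNamespace(data_version=97,
                            topics=[("t", "one |\n two", "demo", 0)])
old_list = SimpleNamespace(topics=list(old_state.topics))
convert_topics(old_list, old_state)

# State saved by a newer version: non-verbose, the space is significant.
new_state = SimpleNamespace(data_version=98,
                            topics=[("t", "release notes", "demo", 0)])
new_list = SimpleNamespace(topics=list(new_state.topics))
convert_topics(new_list, new_state)

print(old_list.topics[0][1])  # one|two
print(new_list.topics[0][1])  # release notes
```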

The other is that list admins may be confused when their nicely
commented, pretty, verbose topic regexp turns into an uncommented,
ugly, non-verbose regexp. I actually think this is the least evil way
to address the problem, but I'm still not 100% comfortable with it.

Thoughts, comments, other ideas?

-- 
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


