Mailman 3 Re: [Mailman-Users] regexp help - Mailman-Users


Re: [Mailman-Users] regexp help


Savoy, Jim

21 Oct 2009

5:54 p.m.

...

CNulk writes:

...

Hi Jim,

...

I may be completely wrong (heck, wouldn't be the first time), but why not have your "unique.name" address be an alias to a simple bash/perl/etc. script which simply accepts the email message, rewrites the message to be from the unique.name address, and sends it on to the list. When the list sees the message, it will appear to be "from" unique.name and the standard mailman moderation and message acceptance rules would apply. I had to do something similar for a different email distribution package to get around a software limitation. It should work unless you really need the from address to be from the actual sender.

Yep - I could do that. Have exim rewrite the headers before it sends out the message, but the people on the mailing list do want to know who the email originally came from. They just don't want everyone to have access to their mailing list, just this one account (an inquiries-type of account).

I just re-read Mark's comments, and he says:

So assuming that what you want is to bypass the other header_filter_rules, you need to "add new item" before the current rule 1 so the new rule becomes #1. Then the new rule 1 regexp should be

^from:.*(\s|<)some\.person\.name@gmail\.com(\s|>|$)

and the action Accept.

and that is exactly what I want to do - bypass the header_filter_rules. But I am afraid I don't quite understand this advice, Mark. How can I make this spam rule supersede other rules (like the one that says the list is moderated)? There is no "add new item" button when you haven't got any spam rules. That only comes up after I create my first rule, which does not supersede the moderation rule.

I should probably not say anything else until Mark logs in. He's probably looking at this huge chain and wanting to crack our skulls together like Moe.

jim -


Geoff Shang

21 Oct

6:05 p.m.

New subject: regexp help

On Wed, 21 Oct 2009, Savoy, Jim wrote:

...

Yep - I could do that. Have exim rewrite the headers before it sends out the message, but the people on the mailing list do want to know who the email originally came from. They just don't want everyone to have access to their mailing list, just this one account (an inquiries-type of account).

Ok, maybe I'm missing the point here. Couldn't you just allow all non-member postings and manually approve subscriptions? Define the address as an alternate name for the list if you really don't want to give out the list address.

On one site I administer, we've converted our simple email forwarder public email address to a mailing list, with the project managers as the only subscribers. Apart from the odd bit of fun we have with Reply-to, it works pretty well. It helps to facilitate discussion between the managers and encourages managers to copy everyone else on messages sent back to people outside the project.

Geoff.

Mark Sapiro

10:19 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

I just re-read Mark's comments, and he says:

So assuming that what you want is to bypass the other header_filter_rules, you need to "add new item" before the current rule 1 so the new rule becomes #1. Then the new rule 1 regexp should be

^from:.*(\s|<)some\.person\.name@gmail\.com(\s|>|$)

and the action Accept.

and that is exactly what I want to do - bypass the header_filter_rules.

No, that is not what you want to do. What you want to do is bypass other non-header_filter_rules holds (i.e. post from non-member) with a header_filter_rule. You can't do that.

You can do it with accept_these_nonmembers, but not in your case, because accept_these_nonmembers only works on the From: (or other 'sender' header), not on the To:.

...

But I am afraid I don't quite understand this advice, Mark. How can I make this spam rule supersede other rules (like the one that says the list is moderated)?

You can't.

...

There is no "add new item" button when you haven't got any spam rules. That only comes up after I create my first rule, which does not supersede the moderation rule.

I should probably not say anything else until Mark logs in. He's probably looking at this huge chain and wanting to crack our skulls together like Moe.

It's not your thread - it's the two week backlog of just approved posts. :)

What you need is a custom handler. See the FAQ at <http://wiki.list.org/x/l4A9> for how to install one. In your case, the handler is very simple - just 9 lines.

import re
cre = re.compile('unique\.name', re.IGNORECASE)
def process(mlist, msg, msgdata):
    if mlist.internal_name <> 'abc-l':
        return
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        # Used by the Emergency module
        msgdata['adminapproved'] = 1

Of course, you adjust the regexp 'unique\.name' and the list name 'abc-l' to suit. The handler needs to be in the pipeline before Moderate.

If you make a list specific pipeline for just this list, you can leave out the

if mlist.internal_name <> 'abc-l':
    return

What this does is nothing if the list isn't abc-l. If it is abc-l and if the contents of the To: header of the message matches the regexp in re.compile() case insensitively, then the approved and adminapproved flags will be set in the message metadata and the message won't be subject to any holds.
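The behaviour described above can be tried outside Mailman. What follows is a minimal Python 3 sketch, not the Mailman 2.1 handler itself (which is Python 2, hence the `<>` operator). The list name is passed as a plain string, an assumption standing in for the mlist object, and the addresses are made up for illustration.

```python
import re
from email.message import Message

# Same pattern as in the handler above; adjust to suit.
cre = re.compile(r'unique\.name', re.IGNORECASE)

def process(list_name, msg, msgdata):
    """Stand-in for the handler; list_name is a plain string (assumption)."""
    if list_name != 'abc-l':
        return
    # Header lookup is case-insensitive, so 'to' finds the To: header.
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        msgdata['adminapproved'] = 1

msg = Message()
msg['From'] = 'someone@example.com'      # hypothetical sender
msg['To'] = 'Unique.Name@example.com'    # hypothetical forwarding address

flags = {}
process('abc-l', msg, flags)       # matching list and To: header
other = {}
process('other-list', msg, other)  # different list: nothing is set
```

In a real installation it may be worth checking whether internal_name has to be called as mlist.internal_name() rather than read as an attribute; the sketch sidesteps the question by taking a string.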

-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

liste yoneticisi

2 Nov

7:04 a.m.

New subject: Couldn't change newlist.txt for tr version

Hi;

I tried to make a global change to the newlist welcome message.

For the English version, I changed the file /cwis/htdocs/mailman/templates/en/newlist.txt

and it worked, i.e. the changes took effect in the new list notification.

But similar changes to the file /cwis/htdocs/mailman/templates/tr/newlist.txt had no effect,

i.e. for new Turkish lists, the list owner receives an older version of newlist.txt, one that no longer exists anywhere.

Can you please tell me what is wrong about it?

Mailman Version is 2.1.19

Mark Sapiro

9:12 a.m.

New subject: Couldn't change newlist.txt for tr version

liste yoneticisi wrote:

...

I tried to make a global change to the newlist welcome message.

For the English version, I changed the file /cwis/htdocs/mailman/templates/en/newlist.txt

and it worked, i.e. the changes took effect in the new list notification.

But similar changes to the file /cwis/htdocs/mailman/templates/tr/newlist.txt had no effect,

i.e. for new Turkish lists, the list owner receives an older version of newlist.txt, one that no longer exists anywhere.

Can you please tell me what is wrong about it?

In some cases you have to restart Mailman after making template changes because templates are cached in a qrunner, but this normally affects only templates related to archives and shouldn't be required here, so I don't know what could be wrong.

However, assuming /cwis/htdocs/mailman/ is the Mailman installation $prefix directory, you should not edit the files you edited, because if you ever upgrade, your changes will be overwritten. To create sitewide edited templates, you should create the directories /cwis/htdocs/mailman/templates/site/tr/ and /cwis/htdocs/mailman/templates/site/en/ and put your edited templates there. That way, they will not be overwritten in an upgrade. See the FAQ at <http://wiki.list.org/x/jYA9>.


Savoy, Jim

4:17 p.m.

New subject: regexp help

...

Mark Sapiro wrote:

...

What you need is a custom handler. See the FAQ at <http://wiki.list.org/x/l4A9> for how to install one.

Thank you. Done.

...

In your case, the handler is very simple - just 9 lines.

...

import re
cre = re.compile('unique\.name', re.IGNORECASE)
def process(mlist, msg, msgdata):
    if mlist.internal_name <> 'abc-l':
        return
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        # Used by the Emergency module
        msgdata['adminapproved'] = 1

...

Of course, you adjust the regexp 'unique\.name' and the list name 'abc-l' to suit. The handler needs to be in the pipeline before Moderate.

OK - I made a file called Foo.py and put it in /Mailman/Handlers. I then inserted this module right before 'Moderate' in the pipeline (I edited Defaults.py for this - just as a temporary measure to see if it would work). I will remove this and add a line to mm_cfg.py later.

I then stopped/started the Mailman processes (not sure if that's necessary, but I did it anyway). Now the test email I sent is stuck in the shunt queue and this is in the errors log:

  File "/Mailman/Handlers/Foo.py", line 2
    cre = re.compile('unique\.name', re.IGNORECASE) def process(mlist, msg, msgdata):
                                                      ^
SyntaxError: invalid syntax

It didn't line up well in this email message, but the caret (^) is positioned under the 'f' in the "def".

I know diddly about Python (and not that much more about Mailman, really) so I'm not sure how to fix the problem. Any ideas? Thanks!

jim -

Mark Sapiro

4:50 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

...
Mark Sapiro wrote:

...
What you need is a custom handler. See the FAQ at <http://wiki.list.org/x/l4A9> for how to install one.

Thank you. Done.

...
In your case, the handler is very simple - just 9 lines.

...
import re
cre = re.compile('unique\.name', re.IGNORECASE)
def process(mlist, msg, msgdata):
    if mlist.internal_name <> 'abc-l':
        return
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        # Used by the Emergency module
        msgdata['adminapproved'] = 1

...
Of course, you adjust the regexp 'unique\.name' and the list name 'abc-l' to suit. The handler needs to be in the pipeline before Moderate.

OK - I made a file called Foo.py and put it in /Mailman/Handlers. I then inserted this module right before 'Moderate' in the pipeline (I edited Defaults.py for this - just as a temporary measure to see if it would work). I will remove this and add a line to mm_cfg.py later.

I then stopped/started the Mailman processes (not sure if that's necessary, but I did it anyway). Now the test email I sent is stuck in the shunt queue and this is in the errors log:

  File "/Mailman/Handlers/Foo.py", line 2
    cre = re.compile('unique\.name', re.IGNORECASE) def process(mlist, msg, msgdata):
                                                      ^
SyntaxError: invalid syntax

You used some kind of word processor to create Foo.py that concatenated lines 2 and 3 into a single line. Your Foo.py file must be just like my original example, with lines 1, 2 and 3 at the left margin, lines 4 and 6 indented 4 spaces and lines 5, 7, 8 and 9 indented 8 spaces.
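The failure can be reproduced without touching a live installation. This is a small Python 3 sketch using the built-in compile(). The source strings are shortened stand-ins for the handler, and 'Foo.py' is only a label passed to compile(), not a file that is read.

```python
# Two statements run together on one line, as in the pasted file.
joined = "cre = 1 def process(mlist, msg, msgdata):\n    return\n"

# The same two statements on separate lines.
separate = "cre = 1\ndef process(mlist, msg, msgdata):\n    return\n"

def compiles(source):
    """Return True if the source parses, False on a SyntaxError."""
    try:
        compile(source, 'Foo.py', 'exec')
        return True
    except SyntaxError:
        return False

joined_ok = compiles(joined)      # False: 'def' cannot follow on the same line
separate_ok = compiles(separate)  # True
```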


Savoy, Jim

5:29 p.m.

New subject: regexp help

Mark Sapiro wrote:

...

You used some kind of word processor to create Foo.py that concatenated lines 2 and 3 into a single line. Your Foo.py file must be just like my original example, with lines 1, 2 and 3 at the left margin, lines 4 and 6 indented 4 spaces and lines 5, 7, 8 and 9 indented 8 spaces.

These words you are saying are all true. I just "cut" your code in Outlook and "pasted" it in vi. I will try it again with the indenting you suggested (reminds me of Fortran!). Thanks.

jim -

Mark Sapiro

5:53 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

...
You used some kind of word processor to create Foo.py that concatenated lines 2 and 3 into a single line. Your Foo.py file must be just like my original example, with lines 1, 2 and 3 at the left margin, lines 4 and 6 indented 4 spaces and lines 5, 7, 8 and 9 indented 8 spaces.

These words you are saying are all true. I just "cut" your code in Outlook and "pasted" it in vi. I will try it again with the indenting you suggested (reminds me of Fortran!). Thanks.

Depending on the options set in vi, it can do horrible things to indentation when you paste things in :(

Python is not at all like Fortran. In Fortran (at least through Fortran IV - I never did much with Fortran 77 and nothing beyond that) white space except for line endings is totally insignificant. True, you have some formatting restrictions like positions 1 - 5 for statement numbers, 6 for continuation and 7 - 72 for statements (although some compilers relaxed these), but consider that the compiler's parser/tokenizer doesn't know whether

  do 5 i = 1, 10

is a do loop or an assignment to a variable named do5i until it gets to the comma.

In Python, whitespace is of utmost significance. You either love it or hate it, but block structure is based entirely on indentation.


Savoy, Jim

6:05 p.m.

New subject: regexp help

...

Depending on the options set in vi, it can do horrible things to indentation when you paste things in :(

I just looked at your original posting (using Outlook) and line 3 is not indented, but rather continuous from line 2, and the other indents are in columns 5 and 9 (not 4 and 8). I shall try viewing it with other mail clients, just for kicks.

But now that I know that whitespace is critical in Python, I will be more careful.

...

Python is not at all like Fortran. In Fortran (at least through Fortran IV - I never did much with Fortran 77 and nothing beyond that) white space except for line endings is totally insignificant. True, you have some formatting restrictions like positions 1 - 5 for statement numbers, 6 for continuation and 7 - 72 for statements (although some compilers relaxed these), but consider that the compiler's parser/tokenizer doesn't know whether

I was loosely referring to Fortran reserving certain columns for certain things. I barely remember it at all (that was back in my pre-Commodore64 (PEEKin' and POKEin' anyone?), punched-hole card days).

jim -

Mark Sapiro

6:30 p.m.

New subject: regexp help

...

I just looked at your original posting (using Outlook) and line 3 is not indented, but rather continuous from line 2, and the other indents are in columns 5 and 9 (not 4 and 8). I shall try viewing it with other mail clients, just for kicks.

Right - column 5 IS indented 4 spaces from column 1, and column 9 is indented 8 spaces from column 1, which is what I meant. And line 3 should not be indented, but it also must not be joined to the end of line 2, which is apparently the only real problem with your file.

Look at my original in the archive at <http://mail.python.org/pipermail/mailman-users/2009-October/067496.html> or even as quoted in your reply at <http://mail.python.org/pipermail/mailman-users/2009-November/067555.html>.

Both of those show proper formatting. I guess that's just one more thing we can't trust Outlook to do.

BTW, indentation in steps of 4 is only a convention, it isn't mandatory. The indentation could be 1 space and 2 spaces or 1 tab and 2 tabs or even 4 spaces and 1 tab, although mixing spaces and tabs is dangerous and highly frowned upon. However, in any case, the highest level (the first 3 lines in the example) must not be indented at all.
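The point about indentation width can be checked directly. This is a minimal Python 3 sketch using the built-in compile() and exec(). The function body is invented for illustration; only the indentation pattern matters.

```python
# Block levels indented by 1 and 2 spaces instead of 4 and 8.
one_and_two = (
    "def process(flag):\n"
    " if flag:\n"
    "  return 1\n"
    " return 0\n"
)

# A top-level statement that is itself indented.
indented_top = " import re\n"

def compiles(source):
    """Return True if the source parses; IndentationError is a SyntaxError."""
    try:
        compile(source, '<sketch>', 'exec')
        return True
    except SyntaxError:
        return False

narrow_ok = compiles(one_and_two)   # True: any consistent width is accepted
top_ok = compiles(indented_top)     # False: unexpected indent at top level

# The narrowly indented function also behaves normally when run.
namespace = {}
exec(one_and_two, namespace)
result = namespace['process'](True)
```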


Chr. von Stuckrad

3 Nov

6:03 a.m.

New subject: regexp help (OT: Edit and Mail)

On Mon, 02 Nov 2009, Savoy, Jim wrote:

...

...
Depending on the options set in vi, it can do horrible things to indentation when you paste things in :(

:-) seen that! Therefore modern vims have :set paste<enter>, and as long as you do not 'set nopaste', *no* munging of pastes will be done! I need and use that all the time; I might even put it into my '.vimrc' and make it the default ...

...

I just looked at your original posting (using Outlook) and line 3 is not indented, but rather continuous from line 2, and the other indents are in columns 5 and 9 (not 4 and 8). I shall try viewing it with other mail clients, just for kicks.

'outlook' is a <insert favorite expletive here> for programmers. By default it *reformats* everything into 'paragraphs' of the form:

everything NOT split by an empty line is 'supposed to be a useless linebreak for mailtransfer', then it collects all those lines and reformats the resulting words with single whitespaces to window-size.

An empty line means 'paragraph end', so it may itself vanish anyway; only the linebreak in a paragraph stays.

It's like Microsoft(office)Word's view of Text and you are supposed to write html or rtf anyway :-)

If hints(warnings, whatever) are on, you'll see a line above your munged mail, saying it removed useless newlines, and by clicking it you can get them back.

Stucki

Savoy, Jim

2 Nov

4:23 p.m.

New subject: regexp help

I also just noticed that the shunt queue started to fill up with messages for other lists as well, so I quickly removed the line I had inserted into Defaults.py, stopped/started the Mailman processes, and successfully unshunted everything. I was hoping the code would only affect the one list I am messing with, but I guess if there is a syntax error in it, it breaks the entire pipeline (maybe).

Mark Sapiro

4:53 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

I also just noticed that the shunt queue started to fill up with messages for other lists as well, so I quickly removed the line I had inserted into Defaults.py, stopped/started the Mailman processes, and successfully unshunted everything. I was hoping the code would only affect the one list I am messing with, but I guess if there is a syntax error in it, it breaks the entire pipeline (maybe).

Because your handler is in the GLOBAL_PIPELINE and it has a syntax error, every message encounters the SyntaxError exception and the message is shunted.
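Since the pipeline is an ordered Python list of handler names, placing a handler before 'Moderate' is ordinary list manipulation. The sketch below uses a shortened stand-in list, not the real default pipeline, and 'Foo' is the module name from this thread. In a real installation a statement of this kind would belong in mm_cfg.py rather than in Defaults.py.

```python
# Shortened, illustrative stand-in for the pipeline; the real list is longer.
GLOBAL_PIPELINE = ['SpamDetect', 'Approve', 'Moderate', 'Hold', 'ToOutgoing']

# Insert 'Foo' immediately before 'Moderate' so it runs first.
GLOBAL_PIPELINE.insert(GLOBAL_PIPELINE.index('Moderate'), 'Foo')

foo_position = GLOBAL_PIPELINE.index('Foo')
moderate_position = GLOBAL_PIPELINE.index('Moderate')
```

Looking up the position of 'Moderate' instead of hard-coding an index keeps the statement valid if the surrounding handlers change.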


Savoy, Jim

4:35 p.m.

New subject: regexp help

I also just noticed that all of the other handlers have an accompanying .pyc file, but my Foo.py does not. Perhaps that 'c' stands for "compiled" and I was supposed to compile the code first? (probably seems obvious to someone familiar with Mailman/Python).

Geoff Shang

4:44 p.m.

New subject: regexp help

On Mon, 2 Nov 2009, Savoy, Jim wrote:

...

I also just noticed that all of the other handlers have an accompanying .pyc file, but my Foo.py does not. Perhaps that 'c' stands for "compiled"

It does.

...

and I was supposed to compile the code first? (probably seems obvious to someone familiar with Mailman/Python).

No. The code is automatically compiled if there is no pyc file or the py file is different.

Presumably there's no pyc file because of your syntax error. At a guess, I'd say that the def statement should be on a new line, but I'm not a Python expert.

Geoff.

Savoy, Jim

3 Nov

4:05 p.m.

New subject: regexp help

Hi Mark,

I got it to compile properly, but it is still not working. I made the following changes in Foo.py:

import re
cre = re.compile('test.account', re.IGNORECASE)
def process(mlist, msg, msgdata):
    if mlist.internal_name <> 'abc-l':
        return
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        # Used by the Emergency module
        msgdata['adminapproved'] = 1

Goal: The account test.account@uleth.ca is set to forward mail to the mailing list abc-l@uleth.ca, which should accept it, regardless of who sent it to test.account@uleth.ca (all other mail to this list from non-members will be rejected).

You also wrote:

...

if the contents of the To: header of the message matches the regexp in re.compile() case insensitively, then the approved and adminapproved flags will be set in the message metadata and the message won't be subject to any holds.

So when you say "matches the regexp" do you mean "exactly" matches? And if so, would your regexp work? Or do I need a more specific or accompanying regexp in the re.compile statement? eg

cre = re.compile('test.account@uleth.ca', re.IGNORECASE)

Thanks.

jim -

Mark Sapiro

5 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

Hi Mark,

I got it to compile properly, but it is still not working. I made the following changes in Foo.py:

import re
cre = re.compile('test.account', re.IGNORECASE)
def process(mlist, msg, msgdata):
    if mlist.internal_name <> 'abc-l':
        return
    if cre.search(msg.get('to', '')):
        msgdata['approved'] = 1
        # Used by the Emergency module
        msgdata['adminapproved'] = 1

Goal: The account test.account@uleth.ca is set to forward mail to the mailing list abc-l@uleth.ca, which should accept it, regardless of who sent it to test.account@uleth.ca (all other mail to this list from non-members will be rejected).

Did you put 'Foo' back in the GLOBAL_PIPELINE prior to 'Moderate' and restart Mailman?

What happens when you mail to test.account? Is the mail rejected by Mailman? Does the To: header in the mail in the reject notice contain 'test.account'?

...

You also wrote:

...
if the contents of the To: header of the message matches the regexp in re.compile() case insensitively, then the approved and adminapproved flags will be set in the message metadata and the message won't be subject to any holds.

So when you say "matches the regexp" do you mean "exactly" matches? And if so, would your regexp work?

Since that regexp 'test.account' is not anchored and is searched for by the re.search() method, it means if the string 'test' in any combination of upper/lower case followed by any single character (the . matches any character) followed by the string 'account' in any combination of upper/lower case is in the To: header, it will match.
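The matching rules just described can be seen with a few sample header values. This is a small Python 3 sketch; every address other than test.account@uleth.ca is made up for illustration.

```python
import re

# Unanchored pattern: the dot matches any single character.
cre = re.compile('test.account', re.IGNORECASE)

samples = [
    'test.account@uleth.ca',        # the intended address
    'TEST_Account@example.com',     # case differs, '_' stands in for the dot
    'my-test-account@example.com',  # pattern found in the middle of the string
    'testaccount@example.com',      # no character between 'test' and 'account'
]

# search() looks for the pattern anywhere in the string.
results = [bool(cre.search(value)) for value in samples]
```

The first three values match and the last does not, because the dot must consume exactly one character.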

...

Or do I need a more specific or accompanying regexp in the re.compile statement? eg

cre = re.compile('test.account@uleth.ca', re.IGNORECASE)

You could do that, or even 'test\.account@uleth\.ca' or other, even more restrictive tests on the To: header, but what are the chances of some mail being delivered to the abc-l list from a non-member with 'test.account' somewhere in the To: header, if it wasn't sent to the proper test.account@uleth.ca address?

Some non-list member could mail

To: <abc-l@...>, <my-test-account@example.com>

and the regexp 'test.account' would accept that mail. If you're concerned about this, you could use a regexp like

'(^|[\s<])test\.account@uleth\.ca($|[\s>])'

to require the exact address test.account@uleth.ca delimited by white space, angle brackets or the start and end of the string, but there are probably better ways of doing this such as using email utilities to parse the To: header for all addresses and then testing to see that 'test.account@uleth.ca' is there and 'abc-l@...' is not, but all that seems unnecessary unless you're going to look at Received: headers too.

I.e., if I'm trying to fool you, I can always create a message To: test.account@uleth.ca and send it directly to the abc-l list.
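Both stricter checks mentioned above can be sketched in a few lines of Python 3. The anchored pattern is the one given in this message; the address-parsing variant uses email.utils.getaddresses from the standard library. The function name sent_via_alias and the sample headers are hypothetical, and neither check protects against a deliberately forged To: header.

```python
import re
from email.utils import getaddresses

ALIAS = 'test.account@uleth.ca'
LIST_ADDR = 'abc-l@uleth.ca'

# Exact address, delimited by white space, angle brackets or string ends.
anchored = re.compile(r'(^|[\s<])test\.account@uleth\.ca($|[\s>])',
                      re.IGNORECASE)

def sent_via_alias(to_header):
    """True if the alias is among the To: addresses and the list is not."""
    addrs = {addr.lower() for _name, addr in getaddresses([to_header])}
    return ALIAS in addrs and LIST_ADDR not in addrs

genuine = 'Test Account <Test.Account@uleth.ca>'
spoofed = '<abc-l@uleth.ca>, <my-test-account@example.com>'

regex_genuine = bool(anchored.search(genuine))  # exact address present
regex_spoofed = bool(anchored.search(spoofed))  # look-alike address only
parsed_genuine = sent_via_alias(genuine)
parsed_spoofed = sent_via_alias(spoofed)
```

Parsing the header copes with comma-separated address lists, which the anchored pattern does not treat as delimiters.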


Savoy, Jim

5:24 p.m.

New subject: regexp help

Mark Sapiro wrote:

...

Did you put 'Foo' back in the GLOBAL_PIPELINE prior to 'Moderate' and restart Mailman?

I did.

...

What happens when you mail to test.account? Is the mail rejected by Mailman? Does the To: header in the mail in the reject notice contain 'test.account'?

Yes, it is rejected. And the To: header does not come from test.account but rather from the actual sender. Inside of what looks like an Exchange attachment, I can see the full original message, with the To: header displaying test.account. So it looks like the Exchange server may be wrapping up the original message and obscuring the headers.

...

Since that regexp 'test.account' is not anchored and is searched for by the re.search() method, it means if the string 'test' in any combination of upper/lower case followed by any single character (the . matches any character) followed by the string 'account' in any combination of upper/lower case is in the To: header, it will match.

OK - that's close enough for me. I don't really need to be that specific anyway because my test.account (not the real name of the account) is quite a unique and unusual name.

I have just received word from the owners of this list that they no longer care about me doing this (they have just opened the list up to anyone) so I probably won't spend too much more time on it now, especially if the Exchange server (which I don't have access to) is obscuring the original headers from Mailman. Since we have exim as a front-end to Mailman, I can probably just do some sort of a re-write in there instead.

But thanks anyway. It was an interesting foray into Python for me!

jim -

Mark Sapiro

5:40 p.m.

New subject: regexp help

Savoy, Jim wrote:

...

Mark Sapiro wrote:

...
Did you put 'Foo' back in the GLOBAL_PIPELINE prior to 'Moderate' and restart Mailman?

I did.

...
What happens when you mail to test.account? Is the mail rejected by Mailman? Does the To: header in the mail in the reject notice contain 'test.account'?

Yes, it is rejected. And the To: header does not come from test.account but rather from the actual sender.

That would be the To: header of the reject notice.

...

Inside of what looks like an Exchange attachment, I can see the full original message, with the To: header displaying test.account. So it looks like the Exchange server may be wrapping up the original message and obscuring the headers.

Mailman sends a multipart/mixed message with two parts - a text/plain part containing the reject reason and a message/rfc822 part containing the post as received by Mailman. It is the message in this message/rfc822 part that is what Mailman saw. If that is not the original post, but somehow got wrapped by Exchange in the forwarding process, you'll have to take that into account.
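To look at the headers Mailman actually saw, the message/rfc822 part can be pulled out of the notice with the standard email package. The Python 3 sketch below builds a stand-in notice with the two-part structure described above and then reads the To: header of the embedded post. The addresses, subject-free bodies and the helper name inner_to are invented for illustration.

```python
from email.message import Message
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Stand-in for the post as received by the list.
post = Message()
post['From'] = 'sender@example.com'
post['To'] = 'test.account@uleth.ca'
post.set_payload('Hello')

# Stand-in notice: a text/plain reason plus the post as message/rfc822.
notice = MIMEMultipart('mixed')
notice['To'] = 'sender@example.com'
notice.attach(MIMEText('Reason for rejection goes here.'))
notice.attach(MIMEMessage(post))

def inner_to(message):
    """Return the To: header of the first embedded message/rfc822 part."""
    for part in message.walk():
        if part.get_content_type() == 'message/rfc822':
            # The payload of such a part is a one-element list of messages.
            return part.get_payload(0)['To']
    return None

outer = notice['To']        # header of the notice itself
embedded = inner_to(notice) # header of the post inside it
```

The outer To: names the person who receives the notice, while the embedded To: is the one a handler would have tested.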


Savoy, Jim

4 Nov

11:19 a.m.

New subject: regexp help

...

Mark Sapiro wrote:

...

That would be the To: header of the reject notice.

Yes. That is the message I am analyzing.

...

Mailman sends a multipart/mixed message with two parts - a text/plain part containing the reject reason and a message/rfc822 part containing the post as received by Mailman. It is the message in this message/rfc822 part that is what Mailman saw. If that is not the original post, but somehow got wrapped by Exchange in the forwarding process, you'll have to take that into account.

Got it. Under the message/rfc822 part, the To: header says test.account@uleth.ca.

So in theory, it should accept this message. I will re-analyze everything to make sure there are no typos. Thanks.

jim -
