Mailman 3 Quoting problem in 2.0 - Mailman-Developers

Quoting problem in 2.0

David Champion

3 Jan 2001

9:23 a.m.

There's a problem triggered when list admins enter addresses with double quotes: "foo@bar.biz" instead of foo@bar.biz

These addresses are added - the quotes aren't stripped off - but they cannot be removed via the web interface, because then the quotes ARE stripped off. This is true for my 2.0 release installation.

In my 1.0 rc2 installation, there's a further complication: it appears from the HTML output that mailman tried to subscribe all addresses in the current chunk.

The quoted addresses can be removed via remove_members.

I'll see about a patch for 2.0, but since Barry's talking about 2.0.1 I wanted to mention the problem quickly in case the solution is easy enough to throw in the pot. Seems that it would be.

-- -D. dgc@uchicago.edu NSIT University of Chicago

David Champion

3 Jan 2001

9:26 a.m.

On 2001.01.03, in 20010103032334.Y4909@smack.uchicago.edu, "David Champion" dgc@uchicago.edu wrote:

> There's a problem triggered when list admins enter addresses with double quotes: "foo@bar.biz" instead of foo@bar.biz
>
> These addresses are added - the quotes aren't stripped off - but they cannot be removed via the web interface, because then the quotes ARE stripped off. This is true for my 2.0 release installation.
>
> In my 1.0 rc2 installation, there's a further complication: it appears from the HTML output that mailman tried to subscribe all addresses in

s/subscribe/un&

> the current chunk.
>
> The quoted addresses can be removed via remove_members.
>
> I'll see about a patch for 2.0, but since Barry's talking about 2.0.1 I wanted to mention the problem quickly in case the solution is easy enough to throw in the pot. Seems that it would be.

-- -D. dgc@uchicago.edu NSIT University of Chicago

barry＠digicool.com

4:04 p.m.

>>>>> "DC" == David Champion writes:

DC> There's a problem triggered when list admins enter addresses
DC> with double quotes: "foo@bar.biz" instead of foo@bar.biz

DC> These addresses are added - the quotes aren't stripped off -
DC> but they cannot be removed via the web interface, because then
DC> the quotes ARE stripped off. This is true for my 2.0 release
DC> installation.

DC> In my 1.0 rc2 installation, there's a further complication: it
DC> appears from the HTML output that mailman tried to subscribe
DC> all addresses in the current chunk.

DC> The quoted addresses can be removed via remove_members.

DC> I'll see about a patch for 2.0, but since Barry's talking
DC> about 2.0.1 I wanted to mention the problem quickly in case
DC> the solution is easy enough to throw in the pot. Seems that
DC> it would be.

You're right it is a simple fix, see below.

-Barry

Index: admin.py
===================================================================
RCS file: /cvsroot/mailman/mailman/Mailman/Cgi/admin.py,v
retrieving revision 1.82
diff -u -r1.82 admin.py
--- admin.py	2000/09/29 00:05:04	1.82
+++ admin.py	2001/01/03 16:02:26
@@ -22,6 +22,7 @@
 import cgi
 import string
 import types
+import rfc822
 
 from Mailman import Utils
 from Mailman import MailList
@@ -835,10 +836,13 @@
     #
     # mass subscription processing for members category
     #
+    def clean_names(name):
+        return rfc822.unquote(string.strip(name))
+
     if cgi_info.has_key('subscribees'):
         name_text = cgi_info['subscribees'].value
         name_text = string.replace(name_text, '\r', '')
-        names = filter(None, map(string.strip, string.split(name_text, '\n')))
+        names = filter(None, map(clean_names, string.split(name_text, '\n')))
         send_welcome_msg = string.atoi(
             cgi_info["send_welcome_msg_to_this_batch"].value)
         digest = 0
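For readers on a current Python, the idea in the patch above can be sketched as follows. This is only an illustration, not Mailman code: the rfc822 module no longer exists in Python 3, so the sketch assumes email.utils.unquote as a stand-in for rfc822.unquote, and the helper names clean_name and parse_subscribees are hypothetical.

```python
from email.utils import unquote  # assumed stand-in for the old rfc822.unquote


def clean_name(name):
    """Strip surrounding whitespace, then surrounding double quotes."""
    return unquote(name.strip())


def parse_subscribees(name_text):
    """Turn the text of a mass-subscribe form field into a list of addresses.

    Mirrors the patched logic: drop carriage returns, split on newlines,
    clean each entry, and discard empty entries.
    """
    name_text = name_text.replace("\r", "")
    return [n for n in map(clean_name, name_text.split("\n")) if n]


raw = '"foo@bar.biz"\r\n\r\n  bar@baz.org  \r\n'
print(parse_subscribees(raw))  # ['foo@bar.biz', 'bar@baz.org']
```

Because the quotes are removed before the address is stored, the stored form is the same one the web interface later looks up when unsubscribing.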

Georg Mischler

6:05 p.m.

Barry A. Warsaw wrote:

> You're right it is a simple fix, see below.

While you're at it... The following is also a simple fix, which eliminates at least some of the "inexplicable" failures to create HTML archives:

--- Mailbox.py  Wed Jan  3 18:43:59 2001
+++ Mailbox.py  Mon Dec 18 18:59:33 2000
@@ -27,7 +27,7 @@
 class Mailbox(mailbox.UnixMailbox):
     # a better regexp than the Python 1.5.2 default
     _fromlinepattern = r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+' \
-                       r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+\d\d\d\d\s*$'
+                       r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+[+-]?\d\d\d\d\s*$'
     _regexp = re.compile(_fromlinepattern)
 
     def _isrealfromline(self, line):

The rationale for this change is posted here: http://mail.python.org/pipermail/mailman-developers/2000-December/003518.htm...

I discovered the web based CVS-access in the mean time, so I was now able to create a real patch against the most recent version.

Have fun!

-schorsch

-- Georg Mischler -- simulations developer -- schorsch at schorsch.com +schorsch.com+ -- lighting design tools -- http://www.schorsch.com/

barry＠digicool.com

10:27 p.m.

>>>>> "GM" == Georg Mischler schorsch@schorsch.com writes:

GM> Barry A. Warsaw wrote:

>>  You're right it is a simple fix, see below.

GM> While you're at it... The following is also a simple fix,
GM> which eliminates at least some of the "inexplicable" failures
GM> to create HTML archives:

This doesn't seem right. From the referenced archive message, the change is supposed to add a hit for negative timezones, but that's /not/ what those last three \d's are trying to match. They're trying to match a four-digit year. It makes no sense to add an optional sign to the year matching field.

If we wanted to be really correct about this, _isrealfromline() should always return true because the pre-test already does the Right Thing in matching exactly line.startswith('From ') -- see [1]. I'd make that change, but it worries me because some earlier versions of Mailman did not properly >-mangle embedded From_ lines, so it /could/ break existing archives even worse. I'm not comfortable with this as a patch for 2.0.1.
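The concern about un-mangled From_ lines can be illustrated with a small sketch. It is not Mailman's Mailbox class: split_mbox is a hypothetical helper that treats every line starting with 'From ' as a message separator, which is the behaviour you would get if _isrealfromline() always returned true. The sample addresses and message text are made up.

```python
def split_mbox(text):
    """Split mbox text into messages at every line starting with 'From '."""
    messages = []
    current = None
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # Any such line starts a new message; no further checks.
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages


mangled = (
    "From a@example.com Thu Jun 10 13:09:41 1999\n"
    "Subject: one\n\n"
    ">From here on, the body line was mangled correctly.\n"
    "From b@example.com Thu Jun 10 13:10:00 1999\n"
    "Subject: two\n\nbody\n"
)
# Same archive, but written by software that did not >-mangle the body line.
unmangled = mangled.replace(">From here", "From here")

print(len(split_mbox(mangled)))    # 2
print(len(split_mbox(unmangled)))  # 3
```

With correct mangling the simple test is enough; with an older, un-mangled archive the body line is mistaken for a separator and one message becomes two.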

-Barry

[1] http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html

Georg Mischler

11:12 p.m.

Barry A. Warsaw wrote:

> >>>>> "GM" == Georg Mischler schorsch@schorsch.com writes:
>
> GM> Barry A. Warsaw wrote:
>
> >>  You're right it is a simple fix, see below.
>
> GM> While you're at it... The following is also a simple fix,
> GM> which eliminates at least some of the "inexplicable" failures
> GM> to create HTML archives:
>
> This doesn't seem right. From the referenced archive message, the change is supposed to add a hit for negative timezones, but that's /not/ what those last three \d's are trying to match. They're trying to match a four-digit year. It makes no sense to add an optional sign to the year matching field.

As happens once in a while, I'm slightly confused now. My own experiments demonstrated to me that this change removes the problem, or at least that's what I think they demonstrated.

The following is a typical "From " line as I often encounter them:

From schorsch@schorsch.com Thu Jun 10 13:09:41 1999 -0400

From my understanding, this matches to the pattern like follows (whitespace inserted for clarity):

'From \s* \S+                   \s+ \w\w\w \s+ \w\w\w \s+ \d\d? \s+
'From     schorsch@schorsch.com     Thu        Jun        10

 \d\d?:\d\d(:\d\d)? (\s+ \S+)? \s+ [+-]? \d\d\d\d \s*$'
 13   :09  :41           1999      -     0400         '

I must admit that I'm not completely sure why the \S (non whitespace) matching the year is grouped together with the preceding whitespace (hmmm... the year is probably optional?). But in any case, unless I'm missing something crucial, the above interpretation confirms my experiments exactly to the point.

In my experience, the above real-life "From " header is matched by the modified pattern, but not by the original one. Thinking about it, it seems that the last 4*\d group matches *either* the year, *or* the time zone, depending on the existence of one of them. But even if this is the case, it shouldn't be a problem (except for potentially matching a negative year...)
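The two patterns from the patch can be checked against that line directly. The sketch below uses Python 3's re module rather than the Mailbox class itself; ORIGINAL and PATCHED are just local names for the two regular expressions quoted in the diff.

```python
import re

# The pattern before the patch, as quoted in the diff.
ORIGINAL = re.compile(
    r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+'
    r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+\d\d\d\d\s*$'
)
# The pattern after the patch: an optional sign before the last four digits.
PATCHED = re.compile(
    r'From \s*\S+\s+\w\w\w\s+\w\w\w\s+\d\d?\s+'
    r'\d\d?:\d\d(:\d\d)?(\s+\S+)?\s+[+-]?\d\d\d\d\s*$'
)

line = 'From schorsch@schorsch.com Thu Jun 10 13:09:41 1999 -0400'

print(ORIGINAL.match(line))          # None
print(bool(PATCHED.match(line)))     # True
# The optional (\s+\S+)? group swallows the year, so the final four
# digits end up matching the timezone offset rather than the year.
print(repr(PATCHED.match(line).group(2)))  # ' 1999'

# A line without a trailing offset matches both patterns.
plain = 'From schorsch@schorsch.com Thu Jun 10 13:09:41 1999'
print(bool(ORIGINAL.match(plain)), bool(PATCHED.match(plain)))  # True True
```

So both observations in the thread hold at once: the final four digits were written to match a year, and on a line with a trailing numeric offset the patched pattern only matches because those digits match the offset instead.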

If it is actually the timezone that is optional, then the grouping might rather be needed there instead of with the year. Or is the pattern meant to match a line where the timezone comes before the year? We'd need to allow for both possibilities then. My suggestion to use the very robust parsedate_tz() function from rfc822.py instead starts to make more and more sense to me.

Or am I hallucinating beyond repair here?

-schorsch

-- Georg Mischler -- simulations developer -- schorsch at schorsch.com +schorsch.com+ -- lighting design tools -- http://www.schorsch.com/
