Mailman 3 two spaces in subject lines - Mailman-Developers

two spaces in subject lines

Fil

12 Apr 2002

7:52 p.m.

Since I upgraded to have iso_xxx compliant subjects, I notice that most emails go through with TWO spaces after the usual subject_prefix, on all lists. I don't really mind, but just wanted to mention it.

-- Fil


Fil

12 Apr 2002

9:13 p.m.

> Since I upgraded to have iso_xxx compliant subjects, I notice that most emails go through with TWO spaces after the usual subject_prefix, on all lists. I don't really mind, but just wanted to mention it.

More precisely, here's how it happens:

"Subject: =?iso-8859-1?q?=5Bspip-dev=5D_?="
" petite mise a jour =?iso-8859-1?Q?s=E9curi?="
"        =?iso-8859-1?Q?t=E9?= inc_auth_cookie"

(I've enclosed the lines with "" so there are no surprises)

-- Fil

Ben Gertzfield

13 Apr 2002

2:50 a.m.

On Saturday, April 13, 2002, at 06:13 , Fil wrote:

> Since I upgraded to have iso_xxx compliant subjects, I notice that most emails go through with TWO spaces after the usual subject_prefix, on all lists. I don't really mind, but just wanted to mention it.
>
> Precisely, here's how it happens :
> "Subject: =?iso-8859-1?q?=5Bspip-dev=5D_?="
> " petite mise a jour =?iso-8859-1?Q?s=E9curi?="
> "        =?iso-8859-1?Q?t=E9?= inc_auth_cookie"

This is an interesting side case. RFC 2047 says that between encoded words, whitespace is to be ignored; however, here, we have encoded words with US-ASCII in between them.
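The RFC 2047 rule mentioned here can be illustrated with a small decoder. What follows is only a minimal sketch in modern Python 3, not the email.Header code under discussion; the names ENCODED_WORD, FIL_SUBJECT, decode_word and decode_rfc2047 are invented for this illustration. It drops whitespace only when it sits between two encoded-words, so the space encoded as "_" inside the prefix and the folding space before "petite" both survive, which would explain the doubled space Fil reported.

```python
import base64
import re

# One RFC 2047 encoded-word: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")

# Fil's header value from the thread, with its original folding.
FIL_SUBJECT = (
    "=?iso-8859-1?q?=5Bspip-dev=5D_?=\n"
    " petite mise a jour =?iso-8859-1?Q?s=E9curi?=\n"
    "        =?iso-8859-1?Q?t=E9?= inc_auth_cookie"
)


def decode_word(match):
    """Decode one encoded-word; return it untouched if it cannot be decoded."""
    charset, encoding, payload = match.groups()
    if encoding.lower() == "b":
        raw = base64.b64decode(payload)
    else:
        # Q encoding: "_" stands for a space, "=XX" for the byte 0xXX.
        text = re.sub(
            r"=([0-9A-Fa-f]{2})",
            lambda m: chr(int(m.group(1), 16)),
            payload.replace("_", " "),
        )
        raw = text.encode("latin-1")
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset or undecodable bytes: leave the word as it was.
        return match.group(0)


def decode_rfc2047(value):
    """Decode a header value, applying the RFC 2047 whitespace rule."""
    # Unfold: drop line breaks that are followed by whitespace.
    value = re.sub(r"\r?\n(?=[ \t])", "", value)
    parts = []
    pos = 0
    seen_encoded = False
    for match in ENCODED_WORD.finditer(value):
        gap = value[pos:match.start()]
        # Whitespace that only separates two encoded-words is ignored;
        # any other text between them, including its spaces, is kept.
        if not (seen_encoded and gap.strip() == ""):
            parts.append(gap)
        parts.append(decode_word(match))
        pos = match.end()
        seen_encoded = True
    parts.append(value[pos:])
    return "".join(parts)


print(repr(decode_rfc2047(FIL_SUBJECT)))
# '[spip-dev]  petite mise a jour sécurité inc_auth_cookie'
```

Note that the two halves of "sécurité" are joined without a space, while two spaces remain after "[spip-dev]": one came from inside the encoded-word and one from the separator.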

I think the email.Header package I wrote is doing the wrong thing here.
Either we need to represent the whole thing as one or more encoded-words, or we need to be super anal about whitespace between encoded-words and non-encoded-words.

I am currently moving from Tokyo to California, but when I get back and settled I will take a long hard look at this issue. I agree that it's pretty important, and that email.Header is doing the wrong thing with respect to whitespace between encoded-words and non-encoded-words:

>>> from email.Header import Header, decode_header
>>> from email.Charset import Charset
>>> f = Charset("iso-8859-1")
>>> z = Header("Zout alours!", f)
>>> z
<email.Header.Header instance at 0x811b754>
>>> print z
=?iso-8859-1?q?Zout_alours!?=
>>> z.append(" Hello?")
>>> print z
=?iso-8859-1?q?Zout_alours!?= Hello?
>>> decode_header(z)
[('Zout alours!', 'iso-8859-1'), ('Hello?', None)]

Here, the whitespace should *not* be disappearing in decode_header, and in fact there should only be one space between the encoded-word and "Hello?" in the printed-out header.
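For readers who want to repeat the experiment on a current interpreter, here is a rough Python 3 counterpart of the session above. It is a sketch under assumptions, not a statement about the 2002 code: the module is now spelled email.header, and the second chunk is appended with an explicit "us-ascii" charset, because append() without a charset reuses the charset the Header was created with. No output is shown, since the exact spacing is the thing to check on your own installation.

```python
from email.header import Header, decode_header

# Build the same two-chunk header as in the session above.
z = Header("Zout alours!", "iso-8859-1")
# Explicit charset: without it, append() falls back to the header's
# default charset (iso-8859-1 here) rather than plain US-ASCII.
z.append(" Hello?", "us-ascii")

encoded = z.encode()        # RFC 2047 wire form
text = str(z)               # decoded, human-readable form
chunks = decode_header(z)   # list of (bytes, charset name) pairs

print(encoded)
print(repr(text))
print(chunks)
```

The behaviour Ben asks for would show up as a single space before "Hello?" in both printed forms, and as a chunk list that still records where that space belongs.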

It's certainly a thinko in email.Header. I will work on this in a week or so.

Ben

Stephen J. Turnbull

5:59 a.m.

"Ben" == Ben Gertzfield <che@debian.org> writes:

Ben> I think the email.Header package I wrote is doing the wrong
Ben> thing here.

Yup.

Ben> Either we need represent the whole thing as one or more
Ben> encoded-words, or we need to be super anal about whitespace
Ben> between encoded-words and non-encoded-words.

The latter. What are you going to do with encodings you know nothing about, e.g., if I send a message with

Subject: =?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?=

in it?

Ben> It's certainly a thinko in email.Header.

RFC-2822-parsing is a dirty job.

dewa, matane.

-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Don't ask how you can "do" free software business;
ask what your business can "do for" free software.

Ben Gertzfield

6:07 a.m.

On Saturday, April 13, 2002, at 02:59 , Stephen J. Turnbull wrote:

Ben> Either we need represent the whole thing as one or more
Ben> encoded-words, or we need to be super anal about whitespace
Ben> between encoded-words and non-encoded-words.
> The latter. What are you going to do with encodings you know nothing about, eg, if I send a message with
>
> Subject: =?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?=
>
> in it?

Of course, we wouldn't do anything at all with that, regarding whitespace. I'm talking about when encoded-words and non-encoded-words are mixed together.

But I totally agree that we need to be anal; it's just hard to know ahead of time whether to encode the next space as part of an encoded-word, or as the space between an encoded-word and a non-encoded-word. But we certainly must not do both!
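One way to picture the "never both" rule is to decide where each separating space goes by looking at the next chunk. The following is a hedged sketch, not what email.Header does: encode_chunks is a hypothetical helper, and it assumes the chunks are meant to be joined by exactly one logical space. It relies on email.charset.Charset.header_encode from the modern standard library to produce each encoded-word. When an encoded chunk is followed by another encoded chunk, the space goes inside the first one (decoders discard the whitespace between encoded-words); in every other case the space stays outside as a plain separator.

```python
from email.charset import Charset


def encode_chunks(chunks):
    """Join (text, charset_or_None) chunks with one logical space each.

    Hypothetical helper for illustration only.
    """
    words = []
    for index, (text, charset) in enumerate(chunks):
        text = text.strip()
        if charset is None:
            # Plain US-ASCII text is emitted as is.
            words.append(text)
            continue
        has_next = index + 1 < len(chunks)
        if has_next and chunks[index + 1][1] is not None:
            # The separator before another encoded-word will be ignored
            # by decoders, so carry the space inside this word instead.
            text += " "
        words.append(Charset(charset).header_encode(text))
    # Exactly one literal space between adjacent words on the wire.
    return " ".join(words)


subject = encode_chunks([
    ("[spip-dev]", "iso-8859-1"),
    ("petite mise a jour", None),
    ("sécurité", "iso-8859-1"),
    ("inc_auth_cookie", None),
])
print(subject)
```

With this split, the prefix is encoded without a trailing "_", so a decoder sees only the one separating space after it.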

Ben

Stephen J. Turnbull

15 Apr 2002

3:53 p.m.

"Ben" == Ben Gertzfield <che@debian.org> writes:

Ben> On Saturday, April 13, 2002, at 02:59 , Stephen J. Turnbull
Ben> wrote:

Ben> Either we need represent the whole thing as one or more
Ben> encoded-words, or we need to be super anal about whitespace
Ben> between encoded-words and non-encoded-words.
>> The latter.  What are you going to do with encodings you know
>> nothing about, eg, if I send a message with
>> 
>> Subject:
>> =?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?=
>> 
>> in it?

Ben> Of course, we wouldn't do anything at all with that,
Ben> regarding whitespace.  I'm talking about when encoded-words
Ben> and non-encoded-words are mixed together.

On this list it should end up looking like this:

Subject: [Mailman-Developers] =?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?=

or so, no? Urk, folded at the whitespace and all....
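The edge case can be sketched as follows: a list prefix is prepended to a subject whose charset the software knows nothing about, so the encoded-word has to pass through untouched. This is an illustration in modern Python 3 only; add_prefix is a hypothetical helper, and "sjt-1" is the made-up charset from the message above. The standard library's email.header.decode_header can still split the header into chunks and unwrap the Q encoding, because that step does not need the codec.

```python
from email.header import decode_header

SUBJECT = "=?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?="


def add_prefix(prefix, subject):
    """Hypothetical helper: prepend a plain-ASCII prefix, leaving any
    encoded-words in the subject exactly as they arrived."""
    return prefix + " " + subject.lstrip()


prefixed = add_prefix("[Mailman-Developers]", SUBJECT)
chunks = decode_header(prefixed)
print(prefixed)
print(chunks)

# The payload bytes are available, but turning them into text would
# need a codec for "sjt-1", which does not exist.
payload, charset = chunks[-1]
try:
    payload.decode(charset)
except LookupError:
    print("no codec for", charset, "- pass the encoded-word through as is")
```

Since the prefix is plain ASCII and the separator is a single literal space, nothing inside the encoded-word has to be interpreted.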

Ben Gertzfield

16 Apr 2002

12:01 a.m.

On Tuesday, April 16, 2002, at 12:53 , Stephen J. Turnbull wrote:

> On this list it should end up looking like this:
>
> Subject: [Mailman-Developers] =?sjt-1?q?=49=74=27=73=20=6A=75=73=74=20=41=53=43=49=49=21?=
>
> or so, no? Urk, folded at the whitespace and all....

Ah, I see what you're getting at. Yes, thanks. Another edge case.

Ben
