[Mailman-Developers] two spaces in subject lines

Ben Gertzfield che@debian.org
Sat, 13 Apr 2002 11:50:50 +0900


On Saturday, April 13, 2002, at 06:13 , Fil wrote:

>> Since I upgraded to have iso_xxx compliant subjects, I notice that most
>> emails go through with TWO spaces after the usual subject_prefix, on 
>> all
>> lists. I don't really mind, but just wanted to mention it.
>
> Precisely, here's how it happens :
>
>     "Subject: =?iso-8859-1?q?=5Bspip-dev=5D_?="
>     " petite mise a jour =?iso-8859-1?Q?s=E9curi?="
>     "        =?iso-8859-1?Q?t=E9?= inc_auth_cookie"

This is an interesting side case.  RFC 2047 says that between encoded 
words, whitespace is to be ignored; however, here, we have encoded words 
with US-ASCII in between them.

I think the email.Header package I wrote is doing the wrong thing here.  
Either we need represent the whole thing as one or more encoded-words, 
or we need to be super anal about whitespace between encoded-words and 
non- encoded-words.

I am currently moving from Tokyo to California, but when I get back and 
settled I will take a long hard look at this issue.  I agree that it's 
pretty important, and that email.Header is doing the wrong thing with 
respect to whitespace between encoded-words and non- encoded-words:

 >>> from email.Header import Header, decode_header
 >>> from email.Charset import Charset
 >>> f = Charset("iso-8859-1")
 >>> z = Header("Zout alours!", f)
 >>> z
<email.Header.Header instance at 0x811b754>
 >>> print z
=?iso-8859-1?q?Zout_alours!?=
 >>> z.append(" Hello?")
 >>> print z
=?iso-8859-1?q?Zout_alours!?=  Hello?
 >>> decode_header(z)
[('Zout alours!', 'iso-8859-1'), ('Hello?', None)]

Here, the whitespace should *not* be disappearing in decode_header, and 
in fact there should only be one space between the encoded-word and 
"Hello?" in the printed-out header.

It's certainly a thinko in email.Header.  I will work on this in a week 
or so..

Ben