[Email-SIG] fixing the current email module

Glenn Linderman v+python at g.nevcal.com
Fri Oct 9 22:43:59 CEST 2009


On approximately 10/9/2009 1:38 AM, came the following characters from 
the keyboard of Tokio Kikuchi:
> Glenn Linderman wrote:
>   
>> On approximately 10/8/2009 8:47 PM, came the following characters from
>> the keyboard of Tokio Kikuchi:
>>     
>>>>> Actually, as long as the prepended text is ASCII, all that work can be
>>>>> done on the encoded value.  When it is not ASCII, it may still be
>>>>> separated and recognizable.  Still that logic is more complex than
>>>>> decoding, handling as Unicode, and encoding.... when it works.  Just
>>>>> pointing out that there is more than one way to do things...       
>>>>>           
>>> Oh, really?
>>>
>>> Base64 is 3 to 4 octets encoding and there is no way to prepend padding.
>>>   
>>>       
>> In header values, encoding is done using encoded-words.  A header value
>> consists of a sequence of ASCII words, and encoded-words.  While an
>> encoded word, that uses base64 encoding cannot easily be adjusted to
>> prepend data into that encoded-word, additional ASCII or encoded-words
>> can be prepended in front of the other ASCII or encoded words within the
>> header-value.
>>
>> So, yes, really!
>>
>>     
> Following two lines have equivalent header contents:
>
> Re: [mmjp-users 123] =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?=
> Re: =?iso-2022-jp?b?W21tanAtdXNlcnMgMTIzXSAbJEJGfEtcOGwbKEI=?=
>
> I'd like to see how you can extract ascii part without touching rest of
> the encoded word in the second example.
>   

I can't, and I didn't say I could.

> What we do in mailman is that both are treated equally and delete
> [mmjp-users 123] from the subject and prefix again by [mmjp-users 124]
> (with new sequential number).  Some MUA encode subjects like the second
> example and this is beyond our control.  Therefore, we are forced to
> decode the whole part of header content.
>   

Yes, if the MUA has created the second encoding, decoding is required in 
order to replace the header prefix.
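
As a rough sketch of that decode-everything approach (not Mailman's
actual code), Python's standard `email.header.decode_header` can decode
both of the subject lines quoted above, after which the old tag can be
removed and a new one added. The helper names and the tag pattern are
assumptions made for this illustration.

```python
import re
from email.header import decode_header

# Hypothetical pattern for the list tag used in the quoted example.
PREFIX = re.compile(r"\[mmjp-users \d+\]\s*")

def decode_subject(raw):
    """Decode every part of a raw header value into one str."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Raises LookupError or UnicodeDecodeError if the part
            # cannot be decoded; this approach needs it to succeed.
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)

def renumber_by_decoding(raw, new_prefix):
    """Drop the old tag from the decoded text and put a new one in front."""
    text = PREFIX.sub("", decode_subject(raw), count=1)
    return new_prefix + " " + text

first = "Re: [mmjp-users 123] =?iso-2022-jp?b?GyRCRnxLXDhsGyhC?="
second = "Re: =?iso-2022-jp?b?W21tanAtdXNlcnMgMTIzXSAbJEJGfEtcOGwbKEI=?="

print(renumber_by_decoding(first, "[mmjp-users 124]"))
print(renumber_by_decoding(second, "[mmjp-users 124]"))
```

Both calls give the same decoded text, which would then have to be
re-encoded before being written back into the message.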

If the MUA has created the first encoding, then decoding would not be 
required in order to replace the header prefix, but the logic to detect 
which case applies, and to handle the two cases separately, adds 
complexity to the application.

What I said was that prefixing a header value with additional text 
doesn't require decoding, and that is true.
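
A minimal sketch of prefixing without decoding, assuming the prefix is
plain ASCII: the new text is simply placed in front of the raw header
value, and the existing encoded-word is never touched. The function name
`add_prefix` is made up for this illustration.

```python
from email.header import decode_header

def add_prefix(raw_value, ascii_prefix):
    """Put an ASCII prefix in front of a raw header value, undecoded."""
    # Raises UnicodeEncodeError if the prefix is not ASCII; a non-ASCII
    # prefix would need to be added as an encoded-word of its own.
    ascii_prefix.encode("ascii")
    return ascii_prefix + " " + raw_value

raw = "=?iso-2022-jp?b?GyRCRnxLXDhsGyhC?="
prefixed = add_prefix(raw, "[mmjp-users 124]")
print(prefixed)

# Decoding is used here only to check the result, not to build it.
data, charset = decode_header(prefixed)[-1]
print(data.decode(charset))
```

Because the original value is copied as-is, the same call works even if
the charset named in the encoded-word is one the application does not
know.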

What you are saying is that you want to do more than prefix a header 
value with additional text.

What you are saying is that you would rather keep the application logic 
simple, by assuming or requiring that the existing header value can be 
decoded.  If that is sufficient for your application, it is a reasonable 
choice.  What do you do with messages for which the header you wish to 
modify cannot be decoded?  Some options would be:

1) bounce the message

2) discard the message

3) determine whether the header value can be partially decoded, and if 
the part that can be decoded contains the data you wish to modify, 
modify it, and simply preserve and pass through the parts that could not 
be decoded.

4) if the header value cannot be decoded at all, or the parts that can 
be decoded do not contain the data you wish to modify, then you could 
choose to simply prefix information onto the header in that case, again 
preserving and passing through the parts that could not be decoded (or, 
in this case, the whole value).

Perhaps you can think of other alternatives besides these; feel free to 
suggest some.

Naturally, doing options 3 or 4 above requires more complex logic for 
the application than options 1 or 2.  The requirements of your 
application should determine the types of choices you make.
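
One possible shape for options 3 and 4, as a sketch only: split the raw
value into ASCII text and encoded-words, try to decode each encoded-word
on its own, replace the tag in the first piece where it is found, and
copy every piece that cannot be decoded through unchanged. The names
`try_decode` and `renumber`, the tag pattern, and the simplified
encoded-word regex are all assumptions made for this illustration.

```python
import re
from email.errors import HeaderParseError
from email.header import Header, decode_header

# Simplified matcher for one encoded-word (assumption for this sketch).
ENCODED_WORD = re.compile(r"(=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=)")
# Hypothetical pattern for the list tag.
PREFIX = re.compile(r"\[mmjp-users \d+\]\s*")

def try_decode(token):
    """Return the text of one encoded-word, or None if undecodable."""
    try:
        [(data, charset)] = decode_header(token)
        return data.decode(charset)
    except (HeaderParseError, LookupError, UnicodeError, ValueError):
        return None

def renumber(raw, new_prefix):
    """Replace the old tag where it can be found; otherwise just prefix."""
    tokens = ENCODED_WORD.split(raw)
    for i, token in enumerate(tokens):
        encoded = ENCODED_WORD.fullmatch(token) is not None
        text = try_decode(token) if encoded else token
        if text is None or not PREFIX.search(text):
            continue  # undecodable or irrelevant: pass through as-is
        text = PREFIX.sub("", text, count=1)
        if encoded:
            # Re-encode only the piece that was changed. Header.encode()
            # may fold long values across lines.
            charset = token.split("?")[1]
            text = Header(text, charset).encode() if text else ""
        tokens[i] = text
        break
    # Option 4 falls out naturally: if no tag was found, nothing was
    # changed and the new prefix is simply added in front.
    return new_prefix + " " + "".join(tokens)

print(renumber("Re: [mmjp-users 123] =?x-unknown?b?AAAA?=",
               "[mmjp-users 124]"))
print(renumber("Re: =?x-unknown?b?AAAA?=", "[mmjp-users 124]"))
```

In both calls the encoded-word names a charset the code cannot decode,
yet the header is still renumbered or prefixed and the undecodable part
is preserved verbatim.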

For example, if a new or non-standard charset appears, an application 
that requires the ability to decode the header, but hasn't been updated 
to understand the charset, will fail to decode it.  Yet, if it has logic 
like 3 or 4, it may be more successful, and would be a more robust 
application.

-- 
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking


