[Email-SIG] email.header.decode_header eats my spaces
Tokio Kikuchi
tkikuchi at is.kochi-u.ac.jp
Wed Mar 28 02:06:49 CEST 2007
Barry Warsaw wrote:
> On Mar 27, 2007, at 3:06 AM, Tokio Kikuchi wrote:
>
>> In my opinion (may not be true to RFC2822 in detail), ascii strings in
>> header object should be strip()ped and separated by FWS (including
>> '\r\n ' or '\r\n\t').
>
> I actually think we should be doing the opposite, namely preserving any
> FWS in the existing text and /not/ substituting continuation_ws for it
> when we re-break the headers. This is the only way to maintain
> idempotency short of saving the original header intact (but then memory
> usage doubles). continuation_ws should be used only when we're forced
> to break at a non-existing FWS location, e.g. if we've split a non-ascii
> header or at a non-whitespace header-specific syntactic break. In the
> case of RFC 2047 headers, the FWS gets consumed anyway so it isn't
> idempotentially (?!) significant.
Well, this will surely break my contribution to Mailman 2.2's
CookHeaders.py, where I unified the code for subject prefix munging for
both ASCII and RFC 2047 subjects. :-(
Almost all MUAs do subject munging by adding 'Re:' and adjusting the
header length. This direction of patching means the Python email package
can no longer be used for, e.g., a webmail application, if I understand
correctly of course.
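
To make the use case concrete, here is a minimal sketch of the kind of unified prefix munging I mean, written against the current Python 3 email package. It is not the actual CookHeaders.py code; the function name add_subject_prefix and the "[list]" prefix are made up for illustration, and collapsing whitespace runs to a single space is my assumption about what a munging application wants.

```python
from email.header import Header, decode_header, make_header


def add_subject_prefix(raw_subject, prefix):
    """Hypothetical helper: prepend a list prefix to any Subject value."""
    # Decode RFC 2047 encoded words (if any) into one unicode string,
    # so ASCII and encoded subjects go through the same code path.
    text = str(make_header(decode_header(raw_subject)))
    # Treat every run of whitespace, including folded '\r\n ' or '\r\n\t',
    # as a single separator (assumption: the original folding is not kept).
    text = " ".join(text.split())
    if prefix not in text:
        text = prefix + " " + text
    # Header() keeps pure ASCII text as is and falls back to UTF-8
    # when the text contains non-ASCII characters.
    return Header(text, header_name="Subject")


munged = add_subject_prefix("Re: =?iso-8859-1?q?caf=E9?=", "[list]")
wire_form = munged.encode()   # folded and RFC 2047 encoded as needed
```

Because the subject is decoded, normalized, and re-encoded, the output is deliberately not byte-for-byte identical to the input, which is exactly where this conflicts with the idempotency goal.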
>
> That's where my patch is headed anyway. I have one test case failure
> left to resolve. It's a bear, but when I get that working I'll submit a
> patch for review. My gut is telling me not to apply this to Python 2.5
> but only Python 2.6 since enough of the semantics of continuation_ws and
> folding has changed that it isn't appropriate for a patch release.
>
Maybe we should add an option to email.header.Header(), like
idempotent=True/False. ;-)
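
Just to sketch what such a switch could mean in practice: no idempotent argument exists on email.header.Header, so the wrapper below, render_header, is purely hypothetical and only illustrates the two behaviours under discussion.

```python
from email.header import Header


def render_header(value, idempotent=False, maxlinelen=76):
    """Hypothetical wrapper; Header() itself has no 'idempotent' option."""
    if idempotent:
        # Preserve the existing folding whitespace exactly as received.
        return value
    # Otherwise normalize all whitespace runs and let Header re-fold.
    text = " ".join(value.split())
    return Header(text, maxlinelen=maxlinelen).encode()


folded = "Re: hello\r\n\tworld"
kept = render_header(folded, idempotent=True)
refolded = render_header(folded)
```

With idempotent=True the value passes through untouched; with the default it is flattened and re-folded, which is what a subject-munging application would want.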
--
Tokio Kikuchi, tkikuchi at is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/