[Email-SIG] email.header.decode_header eats my spaces

Barry Warsaw barry at python.org
Tue Mar 27 15:39:57 CEST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Tokio,

On Mar 27, 2007, at 3:06 AM, Tokio Kikuchi wrote:

> In my opinion (may not be true to RFC2822 in detail), ascii strings  
> in header object should be strip()ped and separated by FWS  
> (including '\r\n ' or '\r\n\t').

I actually think we should be doing the opposite, namely preserving  
any FWS in the existing text and /not/ substituting continuation_ws  
for it when we re-break the headers.  This is the only way to  
maintain idempotency short of saving the original header intact (but  
then memory usage doubles).  continuation_ws should be used only when
we're forced to break where no FWS already exists, e.g. if we've
split a non-ASCII header or broken at a non-whitespace, header-specific
syntactic break.  In the case of RFC 2047 headers, the FWS gets
consumed anyway so it isn't idempotentially (?!) significant.
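
To make the round-trip argument concrete, here is a toy sketch. It is not
the email package's code and not the patch under discussion; the helper
names (fold_preserving_fws, fold_substituting_ws, unfold) and the
maxlinelen value are made up for illustration. It assumes an unfolded
header value containing only spaces and tabs as whitespace, and compares
folding in front of existing whitespace with replacing that whitespace
by a fixed continuation string.

```python
import re


def unfold(folded):
    """RFC 2822 unfolding: drop any CRLF that is followed by a space or tab."""
    return re.sub(r'\r\n(?=[ \t])', '', folded)


def fold_preserving_fws(value, maxlinelen=10):
    """Break only in front of whitespace that is already present in value."""
    # Each token is an optional whitespace run plus the text that follows it,
    # so joining the tokens reproduces value exactly.
    tokens = re.findall(r'[ \t]*[^ \t]+|[ \t]+$', value)
    lines, current = [], ''
    for token in tokens:
        if current and token[0] in ' \t' and len(current) + len(token) > maxlinelen:
            lines.append(current)   # break here; the next line keeps its own FWS
            current = token
        else:
            current += token
    lines.append(current)
    return '\r\n'.join(lines)


def fold_substituting_ws(value, continuation_ws=' '):
    """Lossy variant: every whitespace run becomes CRLF + continuation_ws."""
    return ('\r\n' + continuation_ws).join(value.split())


value = 'one two  three\tfour five'
preserved = fold_preserving_fws(value)
substituted = fold_substituting_ws(value)

print(repr(preserved))
print(unfold(preserved) == value)     # the original whitespace survives
print(unfold(substituted) == value)   # the double space and the tab are lost
```

Only the first strategy gives unfold(fold(value)) == value; the second
collapses the double space and the tab into single spaces, which is the
loss of idempotency described above.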

That's where my patch is headed anyway.  I have one test case failure  
left to resolve.  It's a bear, but when I get that working I'll  
submit a patch for review.  My gut is telling me not to apply this to  
Python 2.5 but only Python 2.6 since enough of the semantics of  
continuation_ws and folding has changed that it isn't appropriate for  
a patch release.

Cheers,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRgkernEjvBPtnXfVAQILlQP/ehJ6raVYLZwd1Pb8ZIuq2+KkGM04JsDd
WwHw1mbfijHaft00bKa7j7dQK9XewicDW9cAuOEQ1SgzfOCOWO+EodHdGbTq3he1
rNlaRQZ2MFaCmQLWYwbwv2zkogu0m9tpSRupwlcdoOzYMNJb0KhLQiVb3GCMHx45
I2IgJdkFV2s=
=XzA4
-----END PGP SIGNATURE-----

