[Email-SIG] Ensuring 7 bit encoding
R. David Murray
rdmurray at bitdance.com
Fri Aug 28 03:45:39 CEST 2009
On Thu, 27 Aug 2009 at 17:42, Mark Sapiro wrote:
> Nicholas Cole wrote:
>>
>> What do I need to do to ensure that emails are generated only in 7,
>> not 8-bit encodings? I assume that I need to use
>> email.charset.add_charset , but can't quite work out what incantation
>> to give it. Does anyone have any pointers?
>
>
> I'm not sure what it is you're asking. Does this answer your question?
>
>>>> import email.message
>>>> m = email.message.Message()
>>>> m.set_payload("""A few lines
> ... of 7-bit text
> ...
> ... No high bit characters.
> ... """, 'us-ascii')
>>>> print m.as_string()
> MIME-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: 7bit
>
> A few lines
> of 7-bit text
>
> No high bit characters.
>
>>>>
It probably doesn't, since if that message contains high range
characters it will result in an encoding of 8bit:
>>> import email.message
>>> m = email.message.Message()
>>> m.set_payload("""A few lines
... of 8-bit text
...
... One high bit character: ².
... """, 'us-ascii')
>>> print m.as_string()
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8bit

A few lines
of 8-bit text

One high bit character: ².
>>>
Since 8bit isn't technically us-ascii, I wonder if this is a bug.
With a little experimentation and a look at the code, it appears that you
will get 7bit clean output as long as you always provide an input charset
other than us-ascii, and one that the charset module has been told to
encode using QP or BASE64 (which is true for all of the already
registered charsets).
E.g., this results in 7bit clean output:
>>> import email.message
>>> m = email.message.Message()
>>> m.set_payload("""A few lines
... of 8-bit text
...
... One high bit character: ².
... """, 'latin-1')
>>> print m.as_string()
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

A few lines
of 8-bit text

One high bit character: =C2=B2.
>>>
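The same idea can be wrapped in a small helper. This is only a sketch, and it assumes Python 3, where payloads are str rather than byte strings as in the transcripts here. The names build_text_message and is_7bit_clean are made up for illustration, and iso-8859-1 is just one possible fallback charset that the charset module encodes with QP. Pure ASCII text is labeled us-ascii; anything else is handed to the fallback charset so that us-ascii is never paired with high bit characters.

```python
from email.message import Message


def build_text_message(text, fallback_charset="iso-8859-1"):
    """Build a text/plain Message whose serialized form should be 7bit clean.

    Hypothetical helper: fallback_charset is assumed to be a charset that
    the email.charset registry encodes with QP or BASE64.
    """
    msg = Message()
    try:
        # Pure ASCII input can safely be labeled us-ascii.
        text.encode("ascii")
    except UnicodeEncodeError:
        # High bit characters present: use a charset with a body encoding.
        msg.set_payload(text, fallback_charset)
    else:
        msg.set_payload(text, "us-ascii")
    return msg


def is_7bit_clean(msg):
    """Return True if every character of the serialized message is ASCII."""
    return all(ord(ch) < 128 for ch in msg.as_string())


if __name__ == "__main__":
    m = build_text_message("One high bit character: \u00b2.\n")
    print(m.as_string())
    print(is_7bit_clean(m))
```

Checking the serialized output with is_7bit_clean is a cheap safety net if the fallback charset is ever changed to one without a body encoding.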
I suspect this is not a complete answer to the question...
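Since the original question mentioned email.charset.add_charset, here is a rough sketch of how it might be used, again assuming Python 3. As far as I can tell, utf-8 is registered there with BASE64 as its body encoding, which is already 7bit clean; the call below re-registers it with QP instead, so that mostly-ASCII bodies stay readable on the wire. The sample text and variable names are only for illustration, and note that add_charset changes the registry for the whole process.

```python
import email.charset
from email.message import Message

# Re-register utf-8 so that message bodies are encoded as quoted-printable.
# Assumption: the default registration uses BASE64 for utf-8 bodies.
email.charset.add_charset(
    "utf-8",
    header_enc=email.charset.SHORTEST,
    body_enc=email.charset.QP,
    output_charset="utf-8",
)

original_text = "One high bit character: \u00b2.\n"

msg = Message()
msg.set_payload(original_text, "utf-8")

# The serialized message should now contain only ASCII characters.
wire_text = msg.as_string()
print(wire_text)

# Decoding the payload should give back the original text.
round_tripped = msg.get_payload(decode=True).decode("utf-8")
```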
--David