[Patches] [ python-Patches-549133 ] RFC 2231 support for email package

noreply@sourceforge.net
Fri, 28 Jun 2002 22:50:59 -0700


Patches item #549133, was opened at 2002-04-26 11:24
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=549133&group_id=5470

Category: Modules
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Oleg Broytmann (phd)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: RFC 2231 support for email package

Initial Comment:
RFC 2231 defines methods for encoding and decoding
parameters in mail headers.

This patch adds support for parameter decoding. It
changes the interface of Message._get_params_preserve():
the function can now return not only an ASCII string but
also a 3-tuple (charset, language, value). Utils.py
contains the low-level functions. All callers of
_get_params_preserve(), namely get_params() and
get_param(), were changed too. Message.get_filename()
returns either an ASCII or a Unicode string.
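
For illustration only, here is a minimal sketch of the 3-tuple interface
described above. It uses the Python 3 standard library, not the 2002
patch itself; the sample header and file name are made up, and the
assumption is that today's email package still exposes the behaviour
this patch introduced.

```python
import email
from email.utils import collapse_rfc2231_value

# Hypothetical message with an RFC 2231 encoded filename parameter:
# charset 'utf-8', language 'en', percent-encoded value.
raw = (
    "Content-Disposition: attachment; filename*=utf-8'en'na%C3%AFve.txt\n"
    "\n"
    "body\n"
)
msg = email.message_from_string(raw)

# get_param() hands back a (charset, language, value) 3-tuple for an
# RFC 2231 encoded parameter, and a plain string otherwise.
param = msg.get_param("filename", header="content-disposition")

# collapse_rfc2231_value() turns the 3-tuple into one decoded string.
decoded = collapse_rfc2231_value(param)

# get_filename() performs the same collapsing internally.
filename = msg.get_filename()
```

Callers that only need the text can call collapse_rfc2231_value() on
whatever get_param() returns, since it also accepts plain strings.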

----------------------------------------------------------------------

>Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-06-29 01:50

Message:
Logged In: YES 
user_id=12800

Thanks Oleg!  Sorry for the delay.  I've accepted this patch
and backported it to Python 2.1 (which the email package
must still support).   Will commit it to Python 2.3 cvs
momentarily.


----------------------------------------------------------------------

Comment By: Oleg Broytmann (phd)
Date: 2002-04-29 08:26

Message:
Logged In: YES 
user_id=4799

New patch uploaded.

----------------------------------------------------------------------

Comment By: Oleg Broytmann (phd)
Date: 2002-04-29 08:24

Message:
Logged In: YES 
user_id=4799

> .encode('ascii')

Agree.

> languge

Fixed.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-29 07:58

Message:
Logged In: YES 
user_id=21627

If it really *has* to be ASCII, please be explicit about
this, invoking .encode('ascii'). I still wonder whether this
could raise a UnicodeError, though.

Another comment: 'languge' is spelled incorrectly in a few
places.

----------------------------------------------------------------------

Comment By: Oleg Broytmann (phd)
Date: 2002-04-29 07:50

Message:
Logged In: YES 
user_id=4799

> I discourage the use of the default encoding. Instead, if
> an encoding is present, a Unicode object, or the information
> about the original encoding should be returned.

This particular function (_formatparam) must return an ASCII
string, not a Unicode object. The resulting string is put
into a header.


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-29 07:34

Message:
Logged In: YES 
user_id=21627

The default encoding is the one returned by
sys.getdefaultencoding(). If this returns, on your system,
say, 'koi-8r', then testing the patch with koi-8r is
equivalent to testing it with ASCII only in a standard
installation.

In your patch, the line

  value = unicode(value[2], value[0]).encode()

makes use of the default encoding in the .encode call. This
call should always have an explicit argument: without one it
fails whenever value[0] differs from the default encoding and
value[2] contains characters that value[0] can represent but
the default encoding cannot.
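
A rough sketch of the point about naming the target encoding, written
for Python 3 (where unicode(...) becomes bytes.decode and the implicit
default for .encode() is no longer site-dependent). The helper name
to_ascii_header_value and the sample file names are hypothetical.

```python
def to_ascii_header_value(raw, charset):
    """Decode raw bytes with the declared charset, then require ASCII."""
    text = raw.decode(charset)       # roughly unicode(value[2], value[0])
    return text.encode("ascii")      # explicit target; raises if not ASCII

# A pure-ASCII value survives the round trip regardless of its charset label.
ok = to_ascii_header_value(b"report.txt", "koi8-r")

# A value with Cyrillic characters decodes fine but cannot become ASCII.
cyrillic = "отчет.txt".encode("koi8-r")
try:
    to_ascii_header_value(cyrillic, "koi8-r")
    failed = False
except UnicodeEncodeError:
    failed = True
```

With the encoding spelled out, the failure is the same on every
installation instead of depending on sys.getdefaultencoding().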

----------------------------------------------------------------------

Comment By: Oleg Broytmann (phd)
Date: 2002-04-29 06:40

Message:
Logged In: YES 
user_id=4799

> Did you test this code with non-ASCII messages?

I did.

> I discourage the use of the default encoding.

What is the "default encoding" in this context?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-04-26 15:58

Message:
Logged In: YES 
user_id=21627

Did you test this code with non-ASCII messages?

I discourage the use of the default encoding. Instead, if an
encoding is present, either a Unicode object or the
information about the original encoding should be returned.
If absolutely necessary, conversion to the default encoding
is acceptable, provided the UnicodeError that the conversion
may raise is caught.

I'm not sure how to deal with UnicodeErrors when
constructing the Unicode object: you probably should raise
an exception, but have that exception carry the data that
caused the problem, so that the caller has the opportunity
to process it by other means.
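
One possible shape for such an exception, sketched in Python 3. The
names ParamDecodeError and decode_param are hypothetical and are not
part of the email package; the sketch only shows an error object that
keeps the undecodable bytes available to the caller.

```python
class ParamDecodeError(ValueError):
    """Hypothetical error that carries the data that failed to decode."""

    def __init__(self, charset, raw, cause):
        super().__init__("cannot decode parameter as %s: %s" % (charset, cause))
        self.charset = charset   # charset named in the header
        self.raw = raw           # original bytes, kept for the caller


def decode_param(raw, charset):
    """Decode raw bytes, wrapping failures in ParamDecodeError."""
    try:
        return raw.decode(charset)
    except (UnicodeDecodeError, LookupError) as exc:
        # LookupError covers charset names Python does not know.
        raise ParamDecodeError(charset, raw, exc) from exc


# The caller can fall back to another strategy using the carried data.
try:
    decode_param(b"\xff\xfe", "utf-8")
    captured = None
except ParamDecodeError as err:
    captured = err
    recovered = err.raw.decode("latin-1")
```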

----------------------------------------------------------------------
