[Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]
Antoine Pitrou
solipsis at pitrou.net
Fri Sep 17 12:10:26 CEST 2010
On Thursday 16 September 2010 at 22:51 -0400, R. David Murray wrote:
> > > On disk, using utf-8,
> > > one might store the text representation of the message, rather than
> > > the wire-format (ASCII encoded) version. We might want to write such
> > > messages from scratch.
> >
> > But then the user knows the encoding (by "user" I mean what/whoever
> > calls the email API) and mentions it to the email package.
>
> Yes? And then? The email package still has to parse the file, and it
> can't use its normal parse-the-RFC-data parser because the file could
> contain *legitimate* non-ASCII header data. So there has to be a separate
> parser for this case that will convert the non-ASCII data into RFC2047
> encoded data. At that point you have two parsers that share a bunch of
> code...and my current implementation lets the input to the second parser
> be text, which is the natural representation of that data, the one the
> user or application writer is going to expect.
But you said it yourself: that "e-mail-like data" is not an email.
You could have a separate converter class for these special cases.
Also, I don't understand why an application would want to assemble an
e-mail by itself if it doesn't know how to do so, and produces wrong
data. Why not simply let the application do:
    m = Message()
    m.add_header("From", "Accented Bàrry <barry at python.org>")
    m.add_body("Hello Barry")
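For illustration only, here is a minimal sketch of that "let the library do the encoding" idea. It uses email.message.EmailMessage from today's standard library, which did not exist when this thread was written, so it is an assumption about how the proposal could look and not the API under discussion. The address barry@example.org is a placeholder.

```python
from email.message import EmailMessage

# The application supplies plain text and never builds RFC 2047 data by hand.
msg = EmailMessage()
msg["From"] = "Accented Bàrry <barry@example.org>"  # placeholder address
msg.set_content("Hello Barry")

# Serialising to wire format is where the library encodes the
# non-ASCII display name as an RFC 2047 encoded word.
wire = msg.as_bytes()
print(wire.decode("ascii"))
```

The application only ever sees text; the ASCII-only wire form is produced at serialisation time.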
> > And then you have two separate worlds while ultimately the same
> > concepts are underlying. A library accepting BytesMessage will crash
> > when a program wants to give a StringMessage and vice-versa. That
> > doesn't sound very practical.
>
> Yes, and a library accepting bytes will crash when a program wants
> to give it a string. So? That's how Python3 works. Unless, of
> course, the application decides to be polymorphic :)
Well, the application wants to handle abstracted e-mail messages. I'm
sure people would rather not deal with the difference(s) between
BytesMessages and StringMessages.
That's like saying we should have BytesConfigParser for bytes
configuration files and StringConfigParser for string configuration
files, with incompatible APIs.
("surrogateescape")
> On the other hand, that might be a way to make the current API work
> at least a little bit better with 8bit input data. I'll have to think
> about that...
Yes, that's what I was talking about.
You can even choose ("ascii", "surrogateescape") if you don't want to
wrongly choose an 8-bit encoding such as utf-8 or latin-1.
(I'm deliberately ignoring the case where people would use a non-ASCII
compatible encoding such as utf-16; I hope you don't want to support
that :-))
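A small sketch of the ("ascii", "surrogateescape") round trip, assuming a header line whose 8-bit encoding is unknown (the sample bytes and the address are made up for the example):

```python
# One non-ASCII byte (0xE0) in otherwise ASCII header data.
raw = b"From: Accented B\xe0rry <barry@example.org>"

# A strict ASCII decode rejects the data outright.
try:
    raw.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape, the undecodable byte is smuggled through as a
# lone surrogate (0xE0 becomes U+DCE0) without guessing an encoding.
text = raw.decode("ascii", "surrogateescape")

# Encoding with the same error handler restores the original bytes exactly.
round_tripped = text.encode("ascii", "surrogateescape")
```

The text can be parsed with str-based code in between, and the unknown bytes survive unchanged as long as the same error handler is used on the way out.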
Regards
Antoine.