[Python-ideas] Add encoding attribute to bytes

Tue Nov 10 05:30:22 CET 2009

Terry Reedy writes:

 > The fundamental problem I am interested in is the separation of raw data 
 > from how to use it info.

But this is ambiguous.  Take reStructuredText.  It *is* text/plain.
But it also *is* application/x-structuredtext.  Not to forget
application/octet-stream.  An MUA will treat it as the first, docutils
as the second, and gzip as the third.
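
To make that concrete, here is a rough sketch of one bytes object
handed to three consumers.  The sample document and the title check
standing in for docutils are my own assumptions, not anything a real
MUA or docutils does internally; only the gzip calls are the actual
stdlib API.

```python
import gzip

# One and the same byte string (a made-up reST fragment).
data = b"Title\n=====\n\nSome *emphasised* text.\n"

# 1. text/plain: an MUA only needs to decode and display it.
as_text = data.decode("ascii")

# 2. application/x-structuredtext: a crude stand-in for docutils,
#    which would look for structure such as a title underline.
lines = as_text.splitlines()
has_title = (
    len(lines) >= 2
    and set(lines[1]) == {"="}
    and len(lines[1]) >= len(lines[0])
)

# 3. application/octet-stream: gzip neither knows nor cares what
#    the bytes mean.
compressed = gzip.compress(data)
roundtrip = gzip.decompress(compressed)
```

All three readings are "correct"; which one applies depends on the
consumer, not on anything stored with the bytes.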

 > My underlying idea is that maybe the standard Python distribution
 > should promote encapsulation of encoding info with raw bytes to
 > make bug-free usage easier.

I think you will find that every use case makes different demands on
this feature, and that it typically interacts with higher-level needs
of the application.  There's a reason that ASN.1 is insanely complex
and only applications that really need it ever use it.  This feature
will either be too simple to serve most practical needs, or too
complex to serve most practical programmers.<wink>

And "bug-free" usage is hopeless.  Much, perhaps the vast majority, of
the coding information will be automatically derived from sources you
deprecate as "heuristic", like MIME Content-Type headers.  It will get
attached to the bytes as an attribute, and after that you can't know
how reliable it is.
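
A minimal sketch of what I mean, assuming the simplest possible
design.  TaggedBytes and its text() method are hypothetical names of
my own, not an existing or proposed stdlib API, and the "header" is
simulated by a hard-coded wrong label.

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass carrying an encoding label."""

    def __new__(cls, data, encoding=None):
        self = super().__new__(cls, data)
        # The label is stored as given; nothing checks it against the data.
        self.encoding = encoding
        return self

    def text(self):
        # Trusts the label blindly.
        return self.decode(self.encoding)


raw = "caf\u00e9".encode("utf-8")

# Pretend the label came from a Content-Type header that was wrong.
tagged = TaggedBytes(raw, encoding="latin-1")

# No exception is raised; the result is simply mojibake.
decoded = tagged.text()

# Ordinary bytes operations also drop the label: a slice is plain bytes.
head = tagged[:3]
```

The attribute travels with the object, but nothing records whether it
came from the producer or from a guess, and the first slice silently
loses it.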

If you have a practical example of such a simple class (bytes +
encoding attribute) that serves as a base for more complex
applications, I'd really like to see it.  But until there are real
use cases on the table, I have to say I can't see the proposed
facility as being particularly useful to the email package, for
example.