[Python-ideas] Add encoding attribute to bytes

Terry Reedy tjreedy at udel.edu
Tue Nov 10 03:44:11 CET 2009


Georg Brandl wrote:

>> What I do not know is whether it is feasible to give an immutable instance
>> of a builtin class a mutable attribute slot.
> 
> As soon as you can mutate an instance, it is not an immutable type anymore.
> Calling it "immutable" despite that will cause trouble.  (The same bytes instance
> could be used somewhere else transparently, e.g. as a function default
> argument, or cached as a constant local.)

OK, scratch that implementation of my idea.
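
To make the objection concrete, here is a small sketch. The function name
send and its default payload are made up for illustration. It shows that a
bytes default argument is one shared object across calls, and that bytes
instances currently reject attribute assignment, so a mutable slot would be
visible to every user of that object.

```python
# Hypothetical helper: the default payload is created once, at definition time.
def send(payload=b"PING"):
    return payload

first = send()
second = send()
# Both calls return the very same object, so any mutable attribute on it
# would be seen by every caller that relies on the default.
print(first is second)

# Built-in bytes instances have no per-instance attribute storage.
try:
    first.encoding = "ascii"
except AttributeError as exc:
    print("rejected:", exc)
```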
> 
> As for the usefulness, I often have to work with proprietary communication
> protocols between computer and devices, and there the bytes have no encoding
> whatsoever

Random bits? It seems to me that a protocol implies some sort of encoding, 
formatting, or structuring, some sort of agreed-on interpretation, even 
if private.
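
For instance, even a private device protocol usually fixes a byte layout.
A minimal sketch, assuming an invented frame of one command byte, a 16-bit
length, and a 32-bit counter, all big-endian (none of this comes from a
real device):

```python
import struct

# Assumed, made-up frame layout: command (1 byte), length (2 bytes),
# counter (4 bytes), big-endian with no padding.
FRAME = struct.Struct(">BHI")

raw = bytes([0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x2A])
command, length, counter = FRAME.unpack(raw)
print(command, length, counter)  # prints: 1 16 42
```

The format string plays the same role for these bytes that a codec name
plays for text: it is the "how to use it" information that travels
separately from the raw data.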

> (though I agree that most bytes do have a meaningful encoding).
> However, a class as fundamental as "bytes" should not be burdened with an
> attribute that may not even apply -- it's easy to make a custom class to
> represent a (bytes, encoding) pair.

The fundamental problem I am interested in is the separation of raw data 
from the information about how to use it. Text encoding of bytes is only 
one instance, though the most common one that pops up on the Python list. 
I had also thought of something like (incomplete):

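class Textbytes:
    """Pair raw bytes with the encoding needed to read them as text."""

    def __init__(self, text, code):
        # Accept str for convenience; store only the encoded bytes.
        if isinstance(text, str):
            text = text.encode(code)
        if not isinstance(text, bytes):
            raise ValueError('text must be str or bytes')
        self.text = text
        self.code = code

    def __str__(self):
        return self.text.decode(self.code)

b = Textbytes('abc', 'utf8')
print(b)  # prints: abc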

One problem is that it is a lot bulkier than a raw bytes object. Leaving 
that aside, a custom class is just that: custom. Stdlib modules will 
neither accept nor produce such a wrapper rather than bytes.

My underlying idea is that maybe the standard Python distribution should 
promote encapsulation of encoding info with raw bytes to make bug-free 
usage easier. Adding an attribute was one implementation idea. Adding a 
standardized wrapper class (at least in a module) would be another.
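
As a rough sketch of what such a wrapper might look like, here is a bytes
subclass that carries its encoding. The name EncodedBytes and its
signature are hypothetical, not an existing or proposed stdlib API.
Because it subclasses bytes, code that checks isinstance(x, bytes) would
still accept it.

```python
class EncodedBytes(bytes):
    """Hypothetical bytes subclass that remembers its text encoding."""

    def __new__(cls, data, encoding):
        # Accept str for convenience and encode it up front.
        if isinstance(data, str):
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        # Subclass instances, unlike plain bytes, can hold attributes.
        self.encoding = encoding
        return self

    def __str__(self):
        return self.decode(self.encoding)


eb = EncodedBytes("caf\u00e9", "utf-8")
print(isinstance(eb, bytes))  # prints: True
print(len(eb))                # prints: 5
print(eb.encoding)            # prints: utf-8
```

One limitation of this sketch: in CPython, operations such as slicing
return plain bytes, so the encoding is lost unless every such method is
overridden as well.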

Terry Jan Reedy



