[Python-ideas] Add encoding attribute to bytes
python at mrabarnett.plus.com
Fri Nov 6 03:19:35 CET 2009
Terry Reedy wrote:
> A Python interpreter has one encoding for floats, ints, and strings.
> sys.float_info and sys.int_info give details about the first two,
> although they are mostly invisible to user code. (I presume they are
> attached to sys rather than float and int precisely because of this.) A
> couple of recent posts have discussed making the unicode encoding (UCS2
> vs. UCS4) both less visible and more discoverable to extensions.
> Bytes are nearly always an encoding of *something*, but the particular
> encoding used is instance-specific. As Guido has said, the programmer
> must keep track. But how? In an OO language, one obvious way is as an
> attribute of the instance. That would be carried with the instance and
> make it self-identifying.
> What I do not know is whether it is feasible to give an immutable instance of a
> builtin class a mutable attribute slot. If it were, I think this could
> make 3.x bytes easier and more transparent to use. When a string is
> encoded to bytes, the attribute would be set. If it were then pickled,
> the attribute would be stored with it and restored with it, and less
> easily lost. If it were then decoded, the attribute would be used. If it
> were sent to the net, the attribute would be used to set the appropriate
> headers. The reverse process would apply from net to bytes to (unicode)
> text. Bytes representing other types of data, such as media, could also be
> tagged, not just those representing text.
> This would be a proposal for 3.3 at the earliest. It would involve
> revising stdlib modules, as appropriate, to use the new info.
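
For illustration only, here is a rough sketch of how the quoted idea might
look as a subclass of bytes, rather than as a change to the builtin itself.
The class name TaggedBytes and the methods from_text, to_text and
content_type are hypothetical, invented for this sketch; nothing like them
exists in the stdlib. The attribute here is an ordinary (mutable) instance
attribute, as in the quoted wording.

```python
class TaggedBytes(bytes):
    """Hypothetical bytes subclass that remembers how its text was encoded."""

    def __new__(cls, data, encoding=None):
        self = super().__new__(cls, data)
        # Subclass instances of bytes get a __dict__, so a plain
        # attribute can be stored on them.
        self.encoding = encoding
        return self

    @classmethod
    def from_text(cls, text, encoding="utf-8"):
        # "When a string is encoded to bytes, the attribute would be set."
        return cls(text.encode(encoding), encoding)

    def to_text(self):
        # "If it were then decoded, the attribute would be used."
        if self.encoding is None:
            raise ValueError("no encoding recorded for these bytes")
        return self.decode(self.encoding)

    def content_type(self):
        # Assumed header format, standing in for "set the appropriate headers".
        return "text/plain; charset=" + self.encoding


payload = TaggedBytes.from_text("caf\u00e9", "utf-8")
print(payload.encoding)
print(payload.to_text())
print(payload.content_type())
```

The instance still compares equal to the plain bytes it wraps, so code that
ignores the attribute would keep working under this sketch.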
You said "give an immutable instance of a builtin class a mutable
attribute slot". Why would the slot be mutable? Surely if the attribute
said that the bytes represented a certain type of data then you
shouldn't be able to change it. ("The attribute says that the bytes are
UTF-8, but I'm going to change it so that it says they are ISO-8859-1.")
I think that the attribute should be immutable.
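
To make the distinction concrete, here is a minimal sketch of an immutable
version, assuming it is done as a subclass of bytes. The name
FrozenTaggedBytes and the transcode method are hypothetical, made up for
this example. The tag is set once at creation and assignment afterwards is
refused; getting ISO-8859-1 bytes means building a new object by actually
re-encoding the text, not relabelling the old one.

```python
class FrozenTaggedBytes(bytes):
    """Hypothetical bytes subclass with a read-only `encoding` tag."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        # Bypass our own __setattr__ exactly once, at creation time.
        object.__setattr__(self, "_encoding", encoding)
        return self

    @property
    def encoding(self):
        return self._encoding

    def __setattr__(self, name, value):
        # Refuse all attribute assignment after construction.
        raise AttributeError("FrozenTaggedBytes attributes are read-only")

    def transcode(self, new_encoding):
        # Re-encode the text and return a NEW, correctly tagged object.
        text = self.decode(self._encoding)
        return FrozenTaggedBytes(text.encode(new_encoding), new_encoding)


utf8 = FrozenTaggedBytes("caf\u00e9".encode("utf-8"), "utf-8")

try:
    utf8.encoding = "iso-8859-1"   # relabelling without re-encoding
except AttributeError as exc:
    print("refused:", exc)

latin1 = utf8.transcode("iso-8859-1")
print(latin1.encoding, len(utf8), len(latin1))
```

In this sketch the two objects hold different byte sequences (the UTF-8 form
is one byte longer), which is the reason a bare change of label would be
wrong.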