
Nick Coghlan wrote:
Terry Reedy wrote:
As for the usefulness, I often have to work with proprietary communication protocols between computer and devices, and there the bytes have no encoding whatsoever.
Random bits? It seems to me that protocol means some sort of encoding, formatting, or structuring, some sort of agreed on interpretation, even if private.
This is true, but the encoding scheme *isn't* a property of the binary data in and of itself. It's metadata about it that guides the application as to how the stream should be interpreted.
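A minimal sketch of that point, assuming nothing beyond the standard library: the same four bytes yield different values depending on which interpretation the application chooses to apply. The variable names are illustrative only and do not come from the thread.

```python
import struct

raw = b"\x41\x42\x43\x44"

# Nothing in `raw` says which of these readings is the "right" one;
# that knowledge lives in the application, outside the data.
as_ascii = raw.decode("ascii")               # 'ABCD'
as_utf16 = raw.decode("utf-16-le")           # two code units: 0x4241, 0x4443
as_uint32_be = struct.unpack(">I", raw)[0]   # 0x41424344 == 1094861636
as_uint16_pair = struct.unpack("<HH", raw)   # (16961, 17475)

print(as_ascii, as_utf16, as_uint32_be, as_uint16_pair)
```

Each result is valid; which one is meaningful depends entirely on what the sender and receiver agreed on beforehand.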
For a lot of the things I've done in the past, I haven't cared at all about the encoding of binary data - I've just been schlepping bits from point A to point B and back without caring what they actually *meant*. Other times I didn't have to guess or pass any metadata around because the comms port was hardwired to a particular device that only knew one way of communicating - the definition of the protocol was implicit in the implementation of the interface software.
In fact, one of the key features typically desired in a communications protocol is for it to be content neutral: you push binary data in one end and get the same binary data out of the other end. Peer applications using the channel to communicate with each other don't need to care what the channel is doing with the data, but equally importantly, the software implementing the comms channel doesn't need to know how to interpret the bits it is transporting*.
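As a rough illustration of content neutrality, here is a sketch of a relay that moves bytes from one stream to another without ever interpreting them. The `relay` function and the sample payloads are hypothetical, and in-memory `io.BytesIO` objects stand in for a real comms channel.

```python
import io

def relay(source, sink, chunk_size=4096):
    """Copy bytes from source to sink without inspecting them."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:          # empty read means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total

# UTF-8 text, every possible byte value, and an arbitrary binary pattern
# all pass through unchanged, because the relay never decodes anything.
payloads = [
    "héllo".encode("utf-8"),
    bytes(range(256)),
    b"\x00\xff\x00\xff",
]
for payload in payloads:
    sink = io.BytesIO()
    relay(io.BytesIO(payload), sink, chunk_size=16)
    assert sink.getvalue() == payload
```

The relay needs no encoding argument at all, which is the property being described: what goes in one end comes out the other.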
For other applications, the Unicode encoding might be important to know. Some will care more about the MIME type, or use some other defined binary encoding (what is the Unicode encoding of an sqlite or bsddb database file?). Other applications may be interested in a proprietary binary format that is formally defined solely by the code that knows how to read and write it.
Can bytes be used to store encoded Unicode data? Sure they can. But they can be used for a whole host of other things as well, so burdening them with an attribute that is occasionally helpful, but more often dead weight or even outright misleading, would be a mistake.
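One way to picture the alternative is a sketch in which encoding information lives in a separate wrapper, used only where it applies, while plain bytes stay unadorned. The `TaggedBytes` class below is hypothetical and is not a proposal made in this thread.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaggedBytes:
    """Hypothetical wrapper: bytes plus the encoding the application knows."""
    data: bytes
    encoding: str

    def text(self) -> str:
        return self.data.decode(self.encoding)

# Binary data with no text encoding stays a plain bytes object.
plain = b"\x89PNG\r\n\x1a\n"

# Encoded text carries its metadata in the wrapper, not in the bytes.
tagged = TaggedBytes("naïve".encode("latin-1"), "latin-1")
print(tagged.text())
```

Only the code that actually deals with encoded text pays for the extra field; everything else keeps working with bare bytes.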
Cheers, Nick.
* Sometimes a bit more coupling makes sense when there are engineering advantages to be had, but this is usually an application specific thing (e.g. IP has a protocol field that identifies different upper-layer protocols such as TCP, UDP and ESP, which have different network performance expectations. This allows IP network routers to apply different rules without having to peek inside the payload of each IP packet).
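The IP example in the footnote can be sketched as follows, assuming an IPv4 header without options. The protocol field is the single byte at offset 9, and the assigned numbers for TCP, UDP and ESP are 6, 17 and 50. The helper names `make_header` and `ip_protocol` are made up for illustration, and the checksum is left at zero rather than computed.

```python
import struct

PROTOCOL_NAMES = {6: "TCP", 17: "UDP", 50: "ESP"}

def make_header(proto, payload_len=0):
    """Build a 20-byte IPv4 header for demonstration (checksum left as 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                    # version 4, header length 5 words
        0,                       # type of service
        20 + payload_len,        # total length
        0,                       # identification
        0,                       # flags and fragment offset
        64,                      # time to live
        proto,                   # protocol field (byte offset 9)
        0,                       # header checksum, not computed here
        bytes([192, 0, 2, 1]),   # source address (documentation range)
        bytes([192, 0, 2, 2]),   # destination address
    )

def ip_protocol(packet):
    """Classify a packet by its header alone, never touching the payload."""
    if len(packet) < 20:
        raise ValueError("too short for an IPv4 header")
    if packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    proto = packet[9]
    return PROTOCOL_NAMES.get(proto, f"unknown({proto})")

packet = make_header(17, payload_len=4) + b"\xde\xad\xbe\xef"
print(ip_protocol(packet))
```

The classifier reads one header byte and treats everything after the header as opaque, which is the limited coupling the footnote describes.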
Your experience has been different from mine. Thanks for the exposition. I can see why you prefer metadata to either be in the stream itself or as part of a wrapper object.
Terry Jan Reedy