On 6/2/2011 1:37 AM, Nick Coghlan wrote:
> On Thu, Jun 2, 2011 at 3:58 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
>> A byte stream with multiple encodings? Now *that* seems wrong!
>
> Unicode encodings are just one serialisation format specific to text data. bytes objects may contain *any* serialisation format (e.g. zip archives, Python pickles, Python marshal files, packed binary data, innumerable wire protocols both standard and proprietary).
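
A minimal Python 3 sketch of the point above: three unrelated serialisation formats all produce plain bytes objects, and only the matching deserialiser recovers the original value. The sample values and the helper name `is_valid_utf8` are illustrative assumptions, not anything from the thread.

```python
import pickle
import struct

text = "héllo"
record = {"a": [1, 2]}

# Three different serialisation formats, one result type: bytes.
as_utf8 = text.encode("utf-8")                 # text via a Unicode encoding
as_packed = struct.pack("<HI", 0xFFFE, 1024)   # packed binary integers
as_pickle = pickle.dumps(record)               # a Python pickle

blobs = (as_utf8, as_packed, as_pickle)
all_bytes = all(isinstance(blob, bytes) for blob in blobs)


def is_valid_utf8(blob: bytes) -> bool:
    """Hypothetical helper: report whether blob happens to decode as UTF-8."""
    try:
        blob.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Each payload round-trips only through its own format.
round_trip_text = as_utf8.decode("utf-8")
round_trip_ints = struct.unpack("<HI", as_packed)
round_trip_obj = pickle.loads(as_pickle)
```

Here `is_valid_utf8(as_packed)` is false because the packed data begins with the byte 0xFE, which never occurs in UTF-8. Nothing in a bytes object says which format it holds; that knowledge has to come from outside the data.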
One result of this thread is that I see much better the value of separating the ancient human-level concepts of character and text from the (3) decades-old computer concept of byte. Numbers, lists, and dicts are other old human concepts. As Nick implies above, bytes (or bits within them) are used to encode all data for computer processing. The confusion of character with byte in the original design of Python both privileged and burdened text processing.

--
Terry Jan Reedy