Chris Lasher writes:
Okay, but a definite -1e6 from me on making my Python interpreter do this:
>>> my_packed_bytes = struct.pack('ffff', 3.544294848931151e-12, 1.853266900760489e+25, 1.6215185358725202e-19, 0.9742483496665955)
>>> my_packed_bytes
b'Why, Guido? Why?'
If you actually have a struct, why aren't you wrapping your_packed_bytes in a class that validates the struct and displays it nicely formatted? Or, alternatively, simply replaces __repr__?
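For illustration only, here is a minimal sketch of both options. The names FloatQuad and HexBytes are hypothetical (nothing like them exists in the stdlib), and the 'ffff' layout is simply taken from the example above. It assumes Python 3.8 or later for the separator argument to bytes.hex().

```python
import struct


class HexBytes(bytes):
    """Hypothetical bytes subclass whose repr is always hex, never ASCII."""

    def __repr__(self):
        # bytes.hex(sep) inserts the separator between each byte (3.8+).
        return f"HexBytes.fromhex({self.hex(' ')!r})"


class FloatQuad:
    """Hypothetical wrapper that validates and pretty-prints an 'ffff' struct."""

    _layout = struct.Struct("ffff")  # four native-order C floats

    def __init__(self, raw):
        # Validate up front: the payload must be exactly one struct long.
        if len(raw) != self._layout.size:
            raise ValueError(
                f"expected {self._layout.size} bytes, got {len(raw)}"
            )
        self.raw = bytes(raw)

    def __repr__(self):
        # Show the decoded fields rather than the underlying bytes.
        fields = ", ".join(f"{x:.6g}" for x in self._layout.unpack(self.raw))
        return f"FloatQuad({fields})"


print(repr(HexBytes(b"Why?")))
print(repr(FloatQuad(b"Why, Guido? Why?")))
```

Either way, the choice of display lives in the application that knows what the bytes mean, and the repr of plain bytes is left alone.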
I do understand the utility of peering into ASCII text, but like Cory Benfield stated earlier:
I'm saying that I don't get to do debugging with a simple print statement when using the bytes type to do actual binary work, while those who are doing sort-of binary work do.
Does the inconvenience of having to explicitly call the .asciify() method on a bytes object justify the current behavior for repr() on a bytes object?
Yes. A choice must be made, because a type has only one repr, and there's no syntax for choosing it. It's a question of whose use case is going to become more convenient and whose becomes less so, and either choice is *justified*. Which is *preferred* is a judgment call. Your judgment doesn't rule, and it definitely doesn't have a weight of 1e6. At this point even Guido's judgment is likely to be dominated by backward compatibility, no matter how much he regrets the necessity. (But I would bet he doesn't regret it at all.)
The privilege of being lazy is obstructing the right to see what we've actually got in the bytes object, and is jeopardizing the very argument that "bytes are not strings".
It does not jeopardize the *fact* that bytes are not strings. People who don't understand that have a fundamental confusion, and they're going to want bytes to DWIM when mixed with str in their applications. And they'll complain when their bytes don't DWIM, and they'll complain even more when the repr "obstructs the right to see what they've actually got in the bytes object", which (in their applications) is a stream containing tokens borrowed from English using the ASCII coded character set.

I agree with you that they're wrong. My point is that they're wrong in such a way that they won't understand that bytes aren't text strings any better merely because they become harder to read. They *know* that there's a text string in there because they put it there!

Cory Benfield wrote and Chris Lasher quoted:
Also, while I'm being picky, 0xDEADBEEF is not a 32-bit pointer, it's a 32-bit something. Its type is undefined. It has a standard usage as a guard word, but still, let's not jump to conclusions here!
I was not jumping to conclusions. I was setting up a scenario. The actual use case is something like "int *pi = 0xDEADBEEF;". The point is that C programmers are deliberately choosing a guard word that is readable when printed as hexadecimal, and also satisfies certain restrictions when those bytes are used as a pointer. That doesn't mean that they are confusing text with pointers. The same is true for Python's repr for bytes.
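The guard-word effect is easy to reproduce from Python. A small illustration (the variable names are mine, and big-endian packing is chosen only so the bytes appear in reading order):

```python
import struct

guard = 0xDEADBEEF
raw = struct.pack(">I", guard)  # four bytes, big-endian unsigned int

print(hex(guard))  # 0xdeadbeef
print(raw.hex())   # deadbeef
print(raw)         # b'\xde\xad\xbe\xef'
```

None of these four bytes is printable ASCII, so the bytes repr falls back to hex escapes and the mnemonic still shows through.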
I do happen to believe that having it be hex would provide a better pedagogical position ("you know this isn't text because it looks like gibberish!"), but that ship sailed a long time ago.
I don't think a gibberish repr will confuse people who think that bytes are text in their application. They'll just get more peeved at Python 3, because they know that there's readable text in there, and Python 3 "obstructs their right to see what's actually in the bytes object".

Regards,