[Python-ideas] Stop displaying elements of bytes objects as printable ASCII characters in CPython 3
Cory Benfield
cory at lukasa.co.uk
Wed Sep 10 19:51:57 CEST 2014
On 10 September 2014 17:59, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> So does 0xDEADBEEF, but actually that's *not* text, it's a 32-bit
> pointer, conveniently invalid on most 32-bit architectures and very
> obvious when it shows up in a backtrace. Do you see an impedance
> mismatch in the C community because of that?
>
> In fact, *all* bytes "look like text", because *you can't see them
> until they're converted to text by repr()*! This is the key to the
> putative "impedance mismatch" -- it's perceived as such when people
> don't distinguish the map from the territory.
I apologise, I was insufficiently clear. I mean that interaction with
the bytes type in Python has a lot of textual aspects to it. This is a
*deliberate* decision (or at least the documentation makes it seem
deliberate), and I can understand the rationale, but it's hard to be
surprised that it leads developers astray.
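
To make "textual aspects" concrete, here is a minimal sketch against
CPython 3. The header value is made up for illustration; any bytes
object with ASCII-range content behaves the same way.

```python
# Illustrative value only, not taken from any real traffic.
header = b"content-type: text/html"

# repr() renders printable ASCII as characters and everything else as escapes.
shown = repr(b"\x48\x69\x00\xff")

# Several str-like methods exist on bytes and assume ASCII semantics.
upper = header.upper()
parts = header.split(b": ")
is_content_header = header.startswith(b"content")

# Indexing, on the other hand, yields an integer rather than a character.
first = header[0]

print(shown)   # b'Hi\x00\xff'
print(upper)   # b'CONTENT-TYPE: TEXT/HTML'
print(parts)   # [b'content-type', b'text/html']
print(first)   # 99
```

The mix is the point: the object prints and splits like text, yet
indexes like a sequence of integers.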
Also, while I'm being picky, 0xDEADBEEF is not a 32-bit pointer, it's
a 32-bit something. Its type is undefined in that expression. It has a
standard usage as a guard word, but still, let's not jump to
conclusions here!
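
As a rough illustration of "a 32-bit something", the same four bytes
can be read back in several ways with the standard struct module. The
big-endian byte order here is an arbitrary choice for the sketch, not
something implied by the value itself.

```python
import struct

# Pack the literal as a big-endian unsigned 32-bit integer (byte order is an assumption).
raw = struct.pack(">I", 0xDEADBEEF)

as_unsigned = struct.unpack(">I", raw)[0]   # unsigned 32-bit integer
as_signed = struct.unpack(">i", raw)[0]     # signed 32-bit integer
as_float = struct.unpack(">f", raw)[0]      # IEEE 754 single-precision float
as_shorts = struct.unpack(">HH", raw)       # two unsigned 16-bit integers

print(raw)          # b'\xde\xad\xbe\xef'
print(as_unsigned)  # 3735928559
print(as_signed)    # -559038737
print(as_shorts)    # (57005, 48879)
print(as_float)     # a large negative float
```

Nothing in the bit pattern says which of these readings is intended;
that comes from the surrounding code.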
I accept your core point, however, which I consider to be this:
> The issue that sometimes it's easier to read hex than ASCII mixed with
> other stuff (hex escapes or Latin-1) is true enough, though. But it's
> not about an impedance mismatch, it's a question of what does *this*
> developer consider to be the convenient repr for *that* task.
This is definitely true, which I believe I've already admitted in this
thread. I do happen to believe that having it be hex would provide a
better pedagogical position ("you know this isn't text because it
looks like gibberish!"), but that ship sailed a long time ago.
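
For comparison, a small sketch of the two views of the same data. The
payload is invented for the example, and binascii.hexlify is used only
as one readily available way to get a hex rendering; it is not a
proposal for what repr() should do.

```python
import binascii

# Made-up payload mixing printable ASCII with non-printable bytes.
payload = b"Hi\x00\xff"

# What repr() shows today: ASCII characters plus hex escapes.
default_view = repr(payload)

# A uniform hex rendering of the same four bytes.
hex_view = binascii.hexlify(payload).decode("ascii")

print(default_view)  # b'Hi\x00\xff'
print(hex_view)      # 486900ff
```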