Make bytes __repr__ and __str__ representation different?

Currently, the __repr__ and __str__ representations of bytes are the same. Perhaps it is worth making them different: this would make it easier to visually perceive bytes as a container of integers from 0 to 255, instead of a mixture of printable and non-printable ASCII characters. It is proposed: a) __str__ - leave unchanged; b) __repr__ - represent as a sequence of escaped hex.
As you can see, the second example is more easily perceived as a sequence, in which '\' plays the same role as ',' in a list or tuple. In addition, 2020 is close, and this would help new Pythonistas not to mistake bytes for strings of mixed ASCII characters. With kind regards, -gdg
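The difference between the two forms can be sketched roughly as follows. This is only an illustration: `hex_escaped_repr` is a hypothetical helper, not anything in Python, and the sample value `b'*+-./'` is borrowed from a later message in this thread.

```python
def hex_escaped_repr(data: bytes) -> str:
    """Hypothetical helper: render every byte as a \\xNN escape."""
    return "b'" + "".join(f"\\x{byte:02x}" for byte in data) + "'"


data = b"*+-./"

# Today, str() and repr() of a bytes object give the same text.
assert str(data) == repr(data) == "b'*+-./'"

proposed = hex_escaped_repr(data)
print(repr(data))  # current form
print(proposed)    # sketched form, every byte escaped

# The escaped form is still a valid bytes literal for the same value.
assert eval(proposed) == data
```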

On Tue, Nov 21, 2017 at 05:37:36PM +0300, Kirill Balunov wrote:
I'd rather leave __str__ and __repr__ alone. Changing them will have huge backwards compatibility implications. I'd rather give bytes a hexdump() method that returns a string: '2a 2b 2d 2e 2f' (possibly with optional arguments to specify the formatting).
> As you can see, the second example is more easily perceived as a sequence, in which '\' is also perceived as ',' in list or tuple.
I disagree. And if you perceive \ as a separator, why does the sequence start with a separator? And why are there so many x characters?
The special role of ASCII is far too important for us to ever completely discard it. -- Steve
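A minimal sketch of the hexdump() idea, written as a free function because bytes has no such method. The name `hexdump` and the `sep` parameter are assumptions made for illustration only.

```python
def hexdump(data: bytes, sep: str = " ") -> str:
    """Hypothetical helper: two hex digits per byte, joined by `sep`."""
    return sep.join(f"{byte:02x}" for byte in data)


print(hexdump(b"*+-./"))       # 2a 2b 2d 2e 2f
print(hexdump(b"*+-./", ":"))  # 2a:2b:2d:2e:2f
```

The result is an ordinary str, so __str__ and __repr__ of bytes stay untouched.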

Hi, While it may shock you, using bytes for "text" makes sense in some areas. Please read the Motivation section of PEP 461: https://www.python.org/dev/peps/pep-0461/#motivation Victor 2017-11-21 15:37 GMT+01:00 Kirill Balunov <kirillbalunov@gmail.com>:
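As an illustration of the kind of use that PEP 461 has in mind, here is a small sketch that assembles an ASCII-based protocol message directly in bytes with %-formatting (available for bytes since Python 3.5). The host, path, and header values are made up for the example.

```python
# Building an ASCII-based wire message without ever leaving bytes.
host = b"example.org"      # made-up value
path = b"/index.html"      # made-up value
content_length = 0

request = b"GET %s HTTP/1.1\r\nHost: %s\r\nContent-Length: %d\r\n\r\n" % (
    path,
    host,
    content_length,
)
print(request)
```

Here the readable, ASCII-flavoured repr of bytes is what makes the printed message easy to inspect.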

On Tue, Nov 21, 2017 at 6:37 AM, Kirill Balunov <kirillbalunov@gmail.com> wrote:
__repr__ is supposed to give an eval-able version, which your proposal is. But the way you did your example indicates that bytes((42, 43, 44, 45, 46)) would be an even better __repr__, if the goal is to make it clear and easy to see that it is a "container of integers from 0 to 255". I've been programming for quite some time, and hex has NEVER come naturally to me :-) But backward compatibility and all that :-( -CHB

2017-11-21 20:22 GMT+03:00 Chris Barker <chris.barker@noaa.gov> wrote:
> But the way you did your example indicates that:
Yes, it is better, but it seemed too radical to me:) 2017-11-21 18:16 GMT+03:00 Steven D'Aprano <steve@pearwood.info> wrote:
Since Python 3.5, bytes has a .hex() method, the same as your .hexdump() but without spaces. But it is still a string. 2017-11-21 18:38 GMT+03:00 Victor Stinner <victor.stinner@gmail.com>:
It does not, because it is a really useful feature. Rather, ASCII was made so that it would fit into a byte, and not vice versa. Nevertheless, bytes is the strangest object in Python. It looks like a string (which contains only ASCII), but it is not a string, because indexing does not return a bytes object: bytes(b'123'[0]) != bytes(b'1'). Maybe it is closer to a tuple, since it is also immutable, but bytes(3) creates the sequence b'\x00\x00\x00', while tuple does not (and what the hell is b'\x00\x00\x00'?). Maybe it has some relationship to integers, but int(b'1') == 1 while bytes([int(49)]) == b'1', i.e. it is not a friend of integers either. It is bytes... With kind regards, -gdg
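The behaviours listed in this message can be checked directly. The snippet below is only a sketch that restates them as assertions; nothing in it is new API.

```python
data = b"123"

# Indexing yields an int; slicing yields bytes.
assert data[0] == 49
assert data[0:1] == b"1"

# bytes(n) with an int n builds n zero bytes,
# so bytes(data[0]) is 49 zero bytes, not b"1".
assert bytes(3) == b"\x00\x00\x00"
assert bytes(data[0]) == b"\x00" * 49
assert bytes(data[0]) != b"1"

# A one-element iterable of ints gives the single byte back.
assert bytes([49]) == b"1"

# int() parses the ASCII digits; it does not return the byte value.
assert int(b"1") == 1

# tuple(3) is not allowed, because an int is not iterable.
try:
    tuple(3)
except TypeError as exc:
    print("tuple(3) raised TypeError:", exc)
```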

On Tue, Nov 21, 2017 at 11:22 AM, Chris Barker <chris.barker@noaa.gov> wrote:
I wonder if, for repr-synonyms, a format specifier to `repr()` that toggles how the object chooses to display itself would be handy:
    x = b'*+-./'
    repr(x)                         # b'*+-./'
    repr(x, bytes.REPR_HEX_STRING)  # b'\x2a\x2b\x2d\x2e\x2f'
    repr(x, bytes.REPR_BYTES)       # bytes([42, 43, 45, 46, 47])
    repr(x, bytes.REPR_HEX_BYTES)   # bytes([0x2A, 0x2B, 0x2D, 0x2E, 0x2F])
Kinda like `format()`, but such that all of `eval(repr(x, <whatever>))` are equal.
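Since `repr()` takes no second argument and bytes has no `REPR_*` constants, the idea can only be approximated today with a separate function. The sketch below assumes a hypothetical `BytesRepr` enum and `repr_bytes` function, named here purely for illustration, and checks the round-trip property mentioned above.

```python
from enum import Enum


class BytesRepr(Enum):
    # Hypothetical names mirroring the constants in the message.
    DEFAULT = "default"
    HEX_STRING = "hex_string"
    BYTES = "bytes"
    HEX_BYTES = "hex_bytes"


def repr_bytes(data: bytes, style: BytesRepr = BytesRepr.DEFAULT) -> str:
    """Hypothetical stand-in for a repr() that accepts a style argument."""
    if style is BytesRepr.HEX_STRING:
        return "b'" + "".join(f"\\x{b:02x}" for b in data) + "'"
    if style is BytesRepr.BYTES:
        return "bytes([" + ", ".join(str(b) for b in data) + "])"
    if style is BytesRepr.HEX_BYTES:
        return "bytes([" + ", ".join(f"0x{b:02X}" for b in data) + "])"
    return repr(data)


x = b"*+-./"
for style in BytesRepr:
    text = repr_bytes(x, style)
    print(f"{style.name:<10} {text}")
    # Every style must evaluate back to the original value.
    assert eval(text) == x
```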

participants (6)
- Chris Angelico
- Chris Barker
- Kirill Balunov
- Nick Timkovich
- Steven D'Aprano
- Victor Stinner