Make bytes __repr__ and __str__ representation different? - Python-ideas

Kirill Balunov

Nov. 21, 2017

2:37 p.m.

Currently, the __repr__ and __str__ representations of bytes are the same. Perhaps it is worth making them different: this would make it easier to visually perceive bytes as a container of integers from 0 to 255, instead of a mixture of printable and non-printable ASCII characters. It is proposed:

a) __str__ - leave unchanged
b) __repr__ - represent as a sequence of escaped hex

...

As you can see, the second example is more easily perceived as a sequence, in which '\' is perceived much like ',' in a list or tuple. In addition, 2020 is close, and this would help new Pythonistas not to mistake bytes for a string of mixed ASCII.

With kind regards,
-gdg

Steven D'Aprano

November 2017

3:16 p.m.

On Tue, Nov 21, 2017 at 05:37:36PM +0300, Kirill Balunov wrote:

...

Currently, the __repr__ and __str__ representations of bytes are the same. Perhaps it is worth making them different: this would make it easier to visually perceive bytes as a container of integers from 0 to 255, instead of a mixture of printable and non-printable ASCII characters. It is proposed:

a) __str__ - leave unchanged
b) __repr__ - represent as a sequence of escaped hex

    >>> a = bytes([42,43,44,45,46])
    >>> a # Current
    b'*+-./'
    >>> a # Proposed
    b'\x2a\x2b\x2d\x2e\x2f'

I'd rather leave __str__ and __repr__ alone. Changing them will have huge backwards compatibility implications. I'd rather give bytes a hexdump() method that returns a string:

'2a 2b 2d 2e 2f'

(possibly with optional arguments to specify the formatting).
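A hexdump()-style helper of the kind suggested here could look roughly like the sketch below. The function name, the `sep` parameter, and its default are assumptions made for illustration; bytes has no method called hexdump().

```python
def hexdump(data: bytes, sep: str = " ") -> str:
    """Return each byte of `data` as two lowercase hex digits joined by `sep`.

    Hypothetical helper: the name and the `sep` argument are illustrative only.
    """
    return sep.join(f"{byte:02x}" for byte in data)


# The bytes displayed as b'*+-./' have the values 42, 43, 45, 46, 47.
print(hexdump(b"*+-./"))       # 2a 2b 2d 2e 2f
print(hexdump(b"*+-./", ":"))  # 2a:2b:2d:2e:2f
```

Returning a plain str keeps the helper separate from __repr__ and __str__, so existing output is unaffected.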

...

As you can see, the second example is more easily perceived as a sequence, in which '\' is perceived much like ',' in a list or tuple.

I disagree. And if you perceive \ as a separator, why does the sequence start with a separator? And why are there so many x characters?

...

In addition, 2020 is close, and this would help new Pythonistas not to mistake bytes for a string of mixed ASCII.

The special role of ASCII is far too important for us to ever completely discard it.

--
Steve

Victor Stinner

3:38 p.m.

Hi,

While it may shock you, using bytes for "text" makes sense in some areas. Please read the Motivation of the PEP 461: https://www.python.org/dev/peps/pep-0461/#motivation

Victor

2017-11-21 15:37 GMT+01:00 Kirill Balunov <kirillbalunov@gmail.com>:

...
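As a small illustration of the use case PEP 461 addresses, %-formatting on bytes (available since Python 3.5) lets ASCII-based wire formats be assembled without a round trip through str. The header names and values below are made up for the example.

```python
# Assemble ASCII-based protocol fragments directly as bytes (PEP 461).
length = 42
header = b"Content-Length: %d\r\n" % length
print(header)  # b'Content-Length: 42\r\n'

# %b interpolates another bytes object into the template.
line = b"%b: %b\r\n" % (b"Host", b"example.org")
print(line)  # b'Host: example.org\r\n'
```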

Chris Barker

5:22 p.m.

On Tue, Nov 21, 2017 at 6:37 AM, Kirill Balunov <kirillbalunov@gmail.com> wrote:

...

Supposedly, __repr__ should give an eval-able version, which your proposal does. But the way you did your example indicates that:

bytes((42, 43, 44, 45, 46))

would be an even better __repr__, if the goal is to make it clear and easy that it is a "container of integers from 0 to 255".

I've been programming since quite some time ago, and hex has NEVER come naturally to me :-)

But backward compatibility and all that :-(

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R
7600 Sand Point Way NE
Seattle, WA 98115
(206) 526-6959 voice
(206) 526-6329 fax
(206) 526-6317 main reception
Chris.Barker@noaa.gov
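Both spellings can be checked against current behaviour. The sketch below assumes only the built-in bytes type and ast.literal_eval; the constructor-style string is built by hand and is not something repr() returns today.

```python
import ast

data = bytes((42, 43, 44, 45, 46))

# The current repr is a bytes literal, so it evaluates back to an equal object.
assert ast.literal_eval(repr(data)) == data

# Iterating (or calling list()) exposes the "container of integers" view.
print(list(data))  # [42, 43, 44, 45, 46]

# A constructor-style spelling, assembled manually for comparison.
constructor_form = f"bytes({tuple(data)!r})"
print(constructor_form)  # bytes((42, 43, 44, 45, 46))
assert eval(constructor_form) == data
```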

Kirill Balunov

8:27 p.m.

2017-11-21 20:22 GMT+03:00 Chris Barker <chris.barker@noaa.gov> wrote:

...

But the way you did your example indicates that:

...

bytes((42, 43, 44, 45, 46))

would be an even better __repr__, if the goal is to make it clear and easy that it is a "container of integers from 0 to 255"

I've been programming since quite some time ago, and hex has NEVER come naturally to me :-)

Yes, it is better, but it seemed too radical to me :)

2017-11-21 18:16 GMT+03:00 Steven D'Aprano <steve@pearwood.info> wrote:

...

I'd rather give bytes a hexdump() method that returns a string:

'2a 2b 2d 2e 2f'

(possibly with optional arguments to specify the formatting).

Since Python 3.5, bytes has a .hex() method, which is the same as your .hexdump() but without spaces. But it still returns a string.

2017-11-21 18:38 GMT+03:00 Victor Stinner <victor.stinner@gmail.com>:

...

While it may shock you, using bytes for "text" makes sense in some areas. Please read the Motivation of the PEP 461: https://www.python.org/dev/peps/pep-0461/#motivation

It does not shock me, because it is a really useful feature. But rather, ASCII was made so that it would fit into a byte, and not vice versa. Nevertheless, bytes is the strangest object in Python. It looks like a string (one that contains only ASCII), but it is not a string, because indexing it does not return a bytes object: bytes(b'123'[0]) != bytes(b'1'). Maybe it is closer to a tuple, since it is also immutable, but bytes(3) creates the sequence b'\x00\x00\x00', and tuple does not (and what the hell is b'\x00\x00\x00'?). Maybe it has some relationship to integers, but int(b'1') == 1 while bytes([int(49)]) == b'1', i.e. it is not a friend of integers either. It is bytes...

With kind regards,
-gdg
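The behaviours listed in this message can be verified directly. The snippet below is a minimal check using only built-ins; the variable name is arbitrary.

```python
data = b"123"

# Indexing yields an int, while slicing yields bytes.
assert data[0] == 49
assert data[0:1] == b"1"

# bytes(n) with an int builds n zero bytes, so bytes(data[0]) is 49 NUL bytes.
assert bytes(3) == b"\x00\x00\x00"
assert bytes(data[0]) == b"\x00" * 49
assert bytes(data[0]) != bytes(b"1")

# tuple() has no comparable integer constructor.
try:
    tuple(3)
except TypeError as exc:
    print("tuple(3) raises TypeError:", exc)

# int() parses the ASCII digits; bytes([n]) wraps a single byte value.
assert int(b"1") == 1
assert bytes([49]) == b"1"
```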

Nick Timkovich

10:49 p.m.

On Tue, Nov 21, 2017 at 11:22 AM, Chris Barker <chris.barker@noaa.gov> wrote:

...

I wonder if, for repr-synonyms, a format specifier to `repr()` that toggles how the object chooses to display itself would be handy:

    x = b'*+-./'
    repr(x)                         # b'*+-./'
    repr(x, bytes.REPR_HEX_STRING)  # b'\x2a\x2b\x2c\x2d\x2e'
    repr(x, bytes.REPR_BYTES)       # bytes([42, 43, 44, 45, 46])
    repr(x, bytes.REPR_HEX_BYTES)   # bytes([0x2A, 0x2B, 0x2C, 0x2D, 0x2E])

Kinda like `format()` but such that all of `eval(repr(x, <whatever>))` are equal.
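Since repr() takes a single argument and bytes defines no REPR_* constants, the idea can only be approximated today with ordinary functions. The three helper names below are hypothetical stand-ins for the proposed modes; the sketch checks the stated property that every spelling evaluates back to an equal object.

```python
def repr_hex_string(data: bytes) -> str:
    """Bytes literal with every byte written as a \\xNN escape."""
    return "b'" + "".join(f"\\x{byte:02x}" for byte in data) + "'"


def repr_bytes(data: bytes) -> str:
    """Constructor call listing the byte values in decimal."""
    return f"bytes({list(data)!r})"


def repr_hex_bytes(data: bytes) -> str:
    """Constructor call listing the byte values in hexadecimal."""
    return "bytes([" + ", ".join(f"0x{byte:02X}" for byte in data) + "])"


x = bytes([42, 43, 44, 45, 46])
for render in (repr, repr_hex_string, repr_bytes, repr_hex_bytes):
    text = render(x)
    print(text)
    # Every spelling evaluates back to an equal bytes object.
    assert eval(text) == x
```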

Chris Angelico

12:51 a.m.

On Wed, Nov 22, 2017 at 9:49 AM, Nick Timkovich <prometheus235@gmail.com> wrote:

...

Methods are usually the best for that. Possibly with class methods to perform the reconstruction - which in this case you have:

...

ChrisA
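One pairing of this shape that already exists on bytes is the instance method hex() together with the class method fromhex(). Whether this is the pair the message had in mind is an assumption; the sketch only shows that the round trip works.

```python
data = b"*+-./"

# Instance method: bytes -> str of hex digits (Python 3.5+).
text = data.hex()
print(text)  # 2a2b2d2e2f

# Class method: str of hex digits -> bytes. Spaces between byte pairs are skipped.
assert bytes.fromhex(text) == data
assert bytes.fromhex("2a 2b 2d 2e 2f") == data
```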
