Proposal: make float.__str__ identical to float.__repr__ in Python 3.2

Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex). Apart from simplifying the internals a little bit, one nice feature of this change is that it removes the differences in formatting between printing a float and printing a container of floats:
l = [1/3, 1/5, 1/7]
print(l)
[0.3333333333333333, 0.2, 0.14285714285714285]
print(l[0], l[1], l[2])
0.333333333333 0.2 0.142857142857
Any thoughts or comments on this? There's a working patch at http://bugs.python.org/issue9337

Mark
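For readers following along, the difference between the two print calls can be reproduced with a short sketch. It assumes, as an approximation, that the pre-3.2 float.__str__ behaved like the '.12g' format; the helper name old_style_str is made up for illustration and is not a CPython function. Containers format their elements with repr(), which is why the list shows more digits.

```python
# Sketch: containers format elements with repr(), while print(x) uses str(x).
# old_style_str is a hypothetical stand-in that approximates the pre-3.2
# float.__str__ (12 significant digits); it is not part of CPython.

def old_style_str(x):
    """Approximate the old 12-significant-digit float.__str__."""
    return format(x, ".12g")

values = [1/3, 1/5, 1/7]

# What print(values) shows: every element goes through repr().
as_container = "[" + ", ".join(repr(v) for v in values) + "]"

# Roughly what print(values[0], values[1], values[2]) showed before the change.
as_old_str = " ".join(old_style_str(v) for v in values)

print(as_container)
print(as_old_str)
print(str(values) == as_container)
```

On an interpreter that includes the proposed change, str() and repr() of each float agree, so both print calls show the same digits.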

On 29/07/2010 19:47, Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
Apart from simplifying the internals a little bit, one nice feature of this change is that it removes the differences in formatting between printing a float and printing a container of floats:
l = [1/3, 1/5, 1/7]
print(l)
[0.3333333333333333, 0.2, 0.14285714285714285]
print(l[0], l[1], l[2])
0.333333333333 0.2 0.142857142857
Any thoughts or comments on this?
There's a working patch at http://bugs.python.org/issue9337
+1

Michael
Mark _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.u...
--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (“BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

On 2010-07-29, at 20:47, Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
Any thoughts or comments on this?
+1

--
Best regards,
Łukasz Langa
tel. +48 791 080 144
WWW http://lukasz.langa.pl/

On Jul 29, 2010, at 11:47 AM, Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
When you proposed the idea at EuroPython, it seemed reasonable but we didn't go into the pros and cons. The downsides include breaking tests, changing the output of report generating scripts that aren't using string formatting, and it introduces another inter-version incompatibility.

The only obvious advantage is that it makes float.__repr__ and float.__str__ the same, making one less thing to explain. Can you elaborate on other advantages? Is there something wrong with the current way?

IIRC, some other tools like MATLAB have a relatively compact default display size for floats, perhaps because formatting matrices becomes awkward when there are too many digits shown and because many of those digits are insignificant. Also, I think those tools have a way to globally change the default number of digits.

Am curious about your thoughts on the problem we're trying to solve and the implications of changing the default.

Raymond

On Thu, Jul 29, 2010 at 8:16 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Jul 29, 2010, at 11:47 AM, Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
When you proposed the idea at EuroPython, it seemed reasonable but we didn't go into the pros and cons. The downsides include breaking tests, changing the output of report generating scripts that aren't using string formatting, and it introduces another inter-version incompatibility.
Yes, I agree that the change has potential for breakage; it's a change that probably would have been unacceptable for Python 2.7; for Python 3.2 I think there's a little more scope, since 3.x has fewer users. And those users it does have at the moment are the early adopters, who with any luck may be more tolerant of this level of breakage. (By the time we get to 3.2 -> 3.3 that's probably not going to be true any more.) Really, this change should have gone into 3.1.

FWIW, the change broke very few of the standard library tests (as Eric Smith verified): there was a (somewhat buggy) doctest in test_tokenize that needed fixing, and test_unicodedata computes a checksum that depends on the str() of various numeric values. Apart from those, only test_float and test_complex needed fixing to reflect the __str__ method changes.
The only obvious advantage is that it makes float.__repr__ and float.__str__ the same, making one less thing to explain. Can you elaborate on other advantages? Is there something wrong with the current way?
That's one advantage; as mentioned earlier the difference between str and repr causes confusion for floats in containers, where users don't realize that two different operations are being used. This is a genuine problem: I've answered questions about this a couple of times on the #python IRC channel.

Another advantage is that it makes 'str' faithful: that is, if x and y are distinct floats then str(x) and str(y) are guaranteed distinct. I know I should know better, but I've been bitten by the lack of faithfulness a couple of times when debugging floating-point problems: I insert a "print(x, y)" line into the code for debugging purposes and still wonder why my 'assertEqual(x, y)' test is failing even though x and y look the same; only then do I remember that I need to use repr instead.

As you say, it's just one less surprise, and one less thing to explain: a small shrinkage of the mental footprint of the language.

Mark
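The debugging trap described above can be sketched in a few lines. As before, old_style_str is a hypothetical helper that approximates the 12-digit str() with the '.12g' format; the specific values 0.1 + 0.2 and 0.3 are chosen for illustration and do not come from the thread.

```python
# Sketch of the "faithfulness" problem: two distinct floats that look the
# same under a 12-significant-digit str() but differ under repr().
# old_style_str is a hypothetical approximation of the pre-3.2 float.__str__.

def old_style_str(x):
    return format(x, ".12g")

x = 0.1 + 0.2
y = 0.3

# The comparison fails...
print(x == y)

# ...yet a 12-digit print makes the two values look identical.
print(old_style_str(x), old_style_str(y))

# repr() is faithful: distinct floats always give distinct strings.
print(repr(x), repr(y))
```

With the proposed change, a plain print(x, y) would show the second form directly, so the failing assertEqual(x, y) would no longer look mysterious.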

On Thu, Jul 29, 2010 at 1:30 PM, Mark Dickinson <dickinsm@gmail.com> wrote:
On Thu, Jul 29, 2010 at 8:16 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
On Jul 29, 2010, at 11:47 AM, Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
When you proposed the idea at EuroPython, it seemed reasonable but we didn't go into the pros and cons. The downsides include breaking tests, changing the output of report generating scripts that aren't using string formatting, and it introduces another inter-version incompatibility.
Yes, I agree that the change has potential for breakage; it's a change that probably would have been unacceptable for Python 2.7; for Python 3.2 I think there's a little more scope, since 3.x has fewer users. And those users it does have at the moment are the early adopters, who with any luck may be more tolerant of this level of breakage. (By the time we get to 3.2 -> 3.3 that's probably not going to be true any more.) Really, this change should have gone into 3.1.
FWIW, the change broke very few of the standard library tests (as Eric Smith verified): there was a (somewhat buggy) doctest in test_tokenize that needed fixing, and test_unicodedata computes a checksum that depends on the str() of various numeric values. Apart from those, only test_float and test_complex needed fixing to reflect the __str__ method changes.
The only obvious advantage is that it makes float.__repr__ and float.__str__ the same, making one less thing to explain. Can you elaborate on other advantages? Is there something wrong with the current way?
That's one advantage; as mentioned earlier the difference between str and repr causes confusion for floats in containers, where users don't realize that two different operations are being used. This is a genuine problem: I've answered questions about this a couple of times on the #python IRC channel.
Another advantage is that it makes 'str' faithful: that is, if x and y are distinct floats then str(x) and str(y) are guaranteed distinct. I know I should know better, but I've been bitten by the lack of faithfulness a couple of times when debugging floating-point problems: I insert a "print(x, y)" line into the code for debugging purposes and still wonder why my 'assertEqual(x, y)' test is failing even though x and y look the same; only then do I remember that I need to use repr instead.
As you say, it's just one less surprise, and one less thing to explain: a small shrinkage of the mental footprint of the language.
+1 from me for all the reasons Mark mentioned.

--
--Guido van Rossum (python.org/~guido)

On 7/29/2010 4:30 PM, Mark Dickinson wrote:
As you say, it's just one less surprise, and one less thing to explain: a small shrinkage of the mental footprint of the language.
With this change, I believe the only difference between str(ob) and repr(ob) will be the addition of quotes. If so, perhaps this could be noted in the repr entry.

Terry Jan Reedy

When you proposed the idea at EuroPython, it seemed reasonable but we didn't go into the pros and cons. The downsides include breaking tests, changing the output of report generating scripts that aren't using string formatting, and it introduces another inter-version incompatibility.
Yes, I agree that the change has potential for breakage; it's a change that probably would have been unacceptable for Python 2.7; for Python 3.2 I think there's a little more scope, since 3.x has fewer users.
+1 for making the change to 3.2
+0 for 2.7
The only obvious advantage is that it makes float.__repr__ and float.__str__ the same, making one less thing to explain. Can you elaborate on other advantages? Is there something wrong with the current way?
That's one advantage; as mentioned earlier the difference between str and repr causes confusion for floats in containers, where users don't realize that two different operations are being used. This is a genuine problem: I've answered questions about this a couple of times on the #python IRC channel.
Another advantage is that it makes 'str' faithful: that is, if x and y are distinct floats then str(x) and str(y) are guaranteed distinct. I know I should know better, but I've been bitten by the lack of faithfulness a couple of times when debugging floating-point problems: I insert a "print(x, y)" line into the code for debugging purposes and still wonder why my 'assertEqual(x, y)' test is failing even though x and y look the same; only then do I remember that I need to use repr instead.
As you say, it's just one less surprise, and one less thing to explain: a small shrinkage of the mental footprint of the language.
Thanks for listing the advantages. Sounds like it is worth the cost.

It also really calls into question whether there are good reasons for other types to have a __str__ that is different than their __repr__.

Raymond

2010/7/29 Raymond Hettinger <raymond.hettinger@gmail.com>:
When you proposed the idea at EuroPython, it seemed reasonable but we didn't go into the pros and cons. The downsides include breaking tests, changing the output of report generating scripts that aren't using string formatting, and it introduces another inter-version incompatibility.
Yes, I agree that the change has potential for breakage; it's a change that probably would have been unacceptable for Python 2.7; for Python 3.2 I think there's a little more scope, since 3.x has fewer users.
+1 for making the change to 3.2
+0 for 2.7
-1 for 2.7.

--
Regards,
Benjamin

On Thu, Jul 29, 2010 at 3:30 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote:
It also really calls into question whether there are good reasons for other types to have a __str__ that is different than their __repr__.
Maybe, but there is tons of 3rd party code that uses this distinction.

--
--Guido van Rossum (python.org/~guido)

On Thu, Jul 29, 2010 at 6:30 PM, Raymond Hettinger <raymond.hettinger@gmail.com> wrote: ..
It also really calls into question whether there are good reasons for other types to have a __str__ that is different than their __repr__.
For strings, the distinction is very useful. In this and many other cases unifying str and repr would mean making a choice between readability and parseability.
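A minimal sketch of that trade-off for strings, using only built-ins plus ast.literal_eval; the sample string is made up for illustration. str() gives the readable text, while repr() gives a quoted, escaped literal that can be parsed back.

```python
import ast

# An arbitrary sample string containing a tab and a newline.
s = "spam\teggs\n"

readable = str(s)    # the text itself, with a real tab and newline
parseable = repr(s)  # a quoted literal with the control characters escaped

print(readable)
print(parseable)

# The repr form can be parsed back into the original string.
print(ast.literal_eval(parseable) == s)
```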

Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
Apart from simplifying the internals a little bit, one nice feature of this change is that it removes the differences in formatting between printing a float and printing a container of floats:
l = [1/3, 1/5, 1/7]
print(l)
[0.3333333333333333, 0.2, 0.14285714285714285]
print(l[0], l[1], l[2])
0.333333333333 0.2 0.142857142857
Any thoughts or comments on this?
There's a working patch at http://bugs.python.org/issue9337
Python 2.5.4 (r254:67916, Jan 20 2010, 21:44:03)
float("0.142857142857") * 7
0.99999999999899991
float("0.14285714285714285") * 7
1.0
I've made a number of tools in the past that needed to round-trip a float through a string and back. I was under the impression that floats needed 17 decimal digits to avoid losing precision. How does one do that efficiently if neither str nor repr return 17 digits?

Robert Brewer
fumanchu@aminus.org

On 29/07/2010 22:37, Robert Brewer wrote:
Mark Dickinson wrote:
Now that we've got the short float repr in Python, there's less value in having float.__str__ truncate to 12 significant digits (as it currently does). For Python 3.2, I propose making float.__str__ use the same algorithm as float.__repr__ for its output (and similarly for complex).
Apart from simplifying the internals a little bit, one nice feature of this change is that it removes the differences in formatting between printing a float and printing a container of floats:
l = [1/3, 1/5, 1/7]
print(l)
[0.3333333333333333, 0.2, 0.14285714285714285]
print(l[0], l[1], l[2])
0.333333333333 0.2 0.142857142857
Any thoughts or comments on this?
There's a working patch at http://bugs.python.org/issue9337
Python 2.5.4 (r254:67916, Jan 20 2010, 21:44:03)
float("0.142857142857") * 7
0.99999999999899991
float("0.14285714285714285") * 7
1.0
I've made a number of tools in the past that needed to round-trip a float through a string and back. I was under the impression that floats needed 17 decimal digits to avoid losing precision. How does one do that efficiently if neither str nor repr return 17 digits?
Because every floating point number represents a range of values - and the new algorithm uses the shortest possible representation within that range of values that will return the same floating point number.

Mark did an excellent presentation on this at EuroPython and the slides are online: http://www.slideshare.net/dickinsm/magical-float-repr

Michael Foord
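The round-trip property can be checked empirically with a small sketch. It assumes an interpreter with the short float repr (Python 3.1 or later); the helper random_finite_float and the fixed seed are illustrative choices, not anything from the thread or the patch. It also shows that 17 significant digits still round-trip, while the 12-digit string from the example above does not.

```python
import math
import random
import struct

def random_finite_float(rng):
    """Build a float from a random 64-bit pattern, skipping inf and nan."""
    while True:
        (x,) = struct.unpack("<d", struct.pack("<Q", rng.getrandbits(64)))
        if math.isfinite(x):
            return x

rng = random.Random(12345)  # fixed seed so the run is repeatable
samples = [random_finite_float(rng) for _ in range(10000)]
samples += [0.1, 1/7, 1e22, 5e-324]

# The short repr always converts back to exactly the same float.
short_round_trips = all(float(repr(x)) == x for x in samples)

# Seventeen significant digits also round-trip; they are just often longer.
long_round_trips = all(float(format(x, ".17g")) == x for x in samples)

print(short_round_trips, long_round_trips)
print(repr(0.1), format(0.1, ".17g"))

# Twelve digits are not enough to recover 1/7, but the short repr is.
print(float("0.142857142857") == 1/7)
print(float("0.14285714285714285") == 1/7)
```

So a tool that needs to round-trip floats through strings can use repr() (or str(), once the proposal lands) directly, without forcing 17 digits.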
participants (9)
- Alexander Belopolsky
- Benjamin Peterson
- Guido van Rossum
- Mark Dickinson
- Michael Foord
- Raymond Hettinger
- Robert Brewer
- Terry Reedy
- Łukasz Langa