Re: [Python-Dev] [Python-checkins] r64424 - in python/trunk: Include/object.h Lib/test/test_sys.py Misc/NEWS Objects/intobject.c Objects/longobject.c Objects/typeobject.c Python/bltinmodule.c
[Adding back the list.]
On Tue, Jun 24, 2008 at 9:53 PM, Raymond Hettinger
While I think it's fine to have some function that reveals the binary representation of floats, I don't think that overlaying this on hex/oct/bin is worth the problems it causes.
What problems? The patch is clean.
Problems like no two people on python-dev agreeing on how exactly the feature should be implemented. Problems like whether this goes way beyond the philosophical underpinnings of bin/oct/hex. Problems like what to do about other types that might want to overload hex/oct/bin. See Kevin Jacobs' response.
This API appears to be purely for educational purposes; why not implement something in pure Python using the struct module that reveals the lay-out of the floating-point value?
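A sketch of that struct-based approach (float_bits is a made-up name, not an existing API): pack the float as an IEEE-754 double and slice out the three stored fields.

```python
import struct

def float_bits(x):
    """Split a double into its raw sign, biased exponent, and mantissa."""
    # '>d' packs x as a big-endian IEEE-754 double; '>Q' reads the
    # same 8 bytes back as a 64-bit unsigned integer.
    (n,) = struct.unpack('>Q', struct.pack('>d', x))
    sign = n >> 63
    exponent = (n >> 52) & 0x7FF          # 11 biased exponent bits
    mantissa = n & ((1 << 52) - 1)        # 52 fraction bits
    return sign, exponent, mantissa

# 0.5 is 1.0 * 2**-1: sign 0, biased exponent 1022, zero mantissa.
```

This shows the memory layout -- which, as Raymond notes below, is not the same thing as the proposed eval()-able form.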
This is not the internal floating point layout. It is the real value expressed in exponential form. It is more than educational -- it is a platform independent representation (look at Terry's reference -- it is the usual way to precisely specify a float value and it does not depend on atof() or vice versa).
Possibly, but it is only readable by a Python expression parser. For all practical purposes "%.17g" % x works just as well. And this bypasses the question "why overload this functionality on bin/hex/oct rather than adding e.g. a new function to math or a new method to float."
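The "%.17g" claim is easy to verify at the prompt: 17 significant decimal digits are enough to round-trip any IEEE-754 double.

```python
# 17 significant decimal digits are enough to round-trip any double.
x = 0.1 + 0.2                # the classic inexact binary sum
s = "%.17g" % x              # '0.30000000000000004'
assert float(s) == x         # the decimal string recovers x exactly
```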
(There are also several things wrong with the specific patch, apart from its lack of docs; #1 is the introduction of an externally visible symbol that doesn't start with _Py.)
Will change the global symbol to _Py. I already added docs to the patch. Did you see the one that was uploaded a few hours ago (float6.diff)?
I don't care about the details of the patch until we have agreement about which form the feature should take. We don't have that agreement yet. I mentioned the flaws in the patch to point out that it was apparently a rush job.
I re-opened the discussion at your behest. [...]
I'm very glad you're giving the discussion a second chance. Please give it a few days at least. My expectation is that the outcome will be not to overload bin/hex/oct but to add a custom function to math or a custom method to float, whose output can be further massaged to create the platform-independent representation you're after. (I doubt that it's worth changing pickle or marshal though, they are doing fine with their respective current approaches.) -- --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum"
I don't care about the details of the patch until we have agreement about which form the feature should take. We don't have that agreement yet.
Updated the patch to address everyone's review comments: http://bugs.python.org/file10742/float8.diff
* Alexander Belopolsky requested exponential notation instead of potentially very long strings of bits. Done
* Alexander Belopolsky requested a true mathematical radix-2 representation of a float rather than its 64-bit memory layout. Done
* Antoine Pitrou requested that hex() and oct() be supported as well as bin(). Terry J. Reedy also requested support for hex(). Done.
* Alexander Belopolsky and Alexandre Vassalotti requested that the output be a two-way street -- something that can be round-tripped through eval(). Done.
* Amaury Forgeot d'Arc requested that the implementation not manipulate C strings in place. Fixed -- used PyUnicode_FromFormat() instead.
* Amaury Forgeot d'Arc requested that tests check whether negative numbers have the same representation as their absolute value. Done.
* Mark Dickinson requested sign-preserving output for bin(-0.0). We couldn't find a clean way to do this without a special-cased output format.
* Mark Dickinson reviewed the NaN/Inf handling. Done.
* Eric Smith requested that the routine be attached to _PyFloat_to_base() instead of attaching to __bin__, __oct__, and __hex__. Done.
* Guido requested that the docs be updated. Done.
* Guido requested that the globally visible C API function name be prefixed with _Py. Done.
* Mark Dickinson requested normalizing output to start with a 1 so that nearby values have similar reprs. Done.
Raymond
On 26/06/2008, Raymond Hettinger
From: "Guido van Rossum"
I don't care about the details of the patch until we have agreement about which form the feature should take. We don't have that agreement yet.
Updated the patch to address everyone's review comments: http://bugs.python.org/file10742/float8.diff
Just as a contrary point, I'm not particularly keen on the output format (which takes the form '0b1 * 2.0 ** 0' as far as I can see), and I'm definitely not keen on the fact that it's overloaded on the hex/bin/oct builtins. Can't it be a separate function? If it is, I don't much care about the output format (as I have no particular need for the feature) but would it not be better if it were machine-parseable, rather than executable? Paul
Just as a contrary point, I'm not particularly keen on the output format (which takes the form '0b1 * 2.0 ** 0' as far as I can see),
That format was requested by everyone else on the tracker discussion. What I originally wanted was something like 0b11.0101. But that didn't round-trip through eval, it didn't match the style used in the numerical papers referenced by Terry Reedy, and it didn't scale well with inputs like 1.1E+100.
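The round-trip property is straightforward to check, since the chosen format is an ordinary Python expression (the sample string below uses the patch's output for bin(3.375), shown later in this thread):

```python
# The '0b... * 2.0 ** e' format is a plain Python expression, so
# eval() reconstructs the exact float value.
s = '0b11011 * 2.0 ** -3'          # the patch's output for bin(3.375)
assert 0b11011 == 27               # the integer part of the expression
assert eval(s) == 27 / 8.0 == 3.375
```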
and I'm definitely not keen on the fact that it's overloaded on the hex/bin/oct builtins.
Can't it be a separate function?
Simplicity. bin/oct/hex have the job of giving alternate base representations for numbers. Nothing is gained by adding a duplicate set of functions in the math module for float inputs.
would it not be better if it were machine-parseable, rather than executable?
We already have struct.pack for machine-parseable output. This is supposed to be human-readable as well as providing an exact, platform-independent way of specifying a particular float value (something that's useful in testing algorithms like that in math.sum()). Raymond
On Thu, Jun 26, 2008 at 5:50 AM, Raymond Hettinger
Just as a contrary point, I'm not particularly keen on the output format (which takes the form '0b1 * 2.0 ** 0' as far as I can see),
That format was requested by everyone else on the tracker discussion. What I originally wanted was something like 0b11.0101. But that didn't round-trip through eval, it didn't match the style used in the numerical papers referenced by Terry Reedy, and it didn't scale well with inputs like 1.1E+100.
and I'm definitely not keen on the fact that it's overloaded on the hex/bin/oct builtins.
Can't it be a separate function?
Simplicity. bin/oct/hex have the job of giving alternate base representations for numbers. Nothing is gained by adding a duplicate set of functions in the math module for float inputs.
I disagree, and others here have disagreed too. We made a conscious decision to *remove* the overloading of hex/oct/bin via __hex__/__oct__/__bin__ in 3.0, in order to simplify these functions, which only work for integers, not for any other style of numbers. If bin(3.4) works, why not bin() of a Fraction, or of a complex number? Or for that matter, why not hex() of a string? All these have some use case. But is that use case important enough to put it in the bin()/hex()/oct() built-in functions? Why not hex() of a dictionary? Where does it end? We drew a line in the sand -- these are only for ints.
would it not be better if it were machine-parseable, rather than executable?
We already have struct.pack for machine-parseable output. This is supposed to be human-readable as well as providing an exact, platform-independent way of specifying a particular float value (something that's useful in testing algorithms like that in math.sum()).
The only use cases you bring up appear to be in testing and education. This is not a strong enough motivation for adding a wart to the bin/oct/hex builtins. I'm sure you can write the same thing in pure Python -- wouldn't that be good enough for testing? And if you then put it somewhere in a module in the stdlib, wouldn't that be good enough for education? There's a strong movement for keeping the language small and clean. Adding more overloaded functionality to built-in functions goes counter to that ideal. A new stdlib function causes no overhead to the *language* definition (and built-ins *are* part of the language). -- --Guido van Rossum (home page: http://www.python.org/~guido/)
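As a rough illustration of Guido's "pure Python" point, math.frexp already exposes everything needed (float_bin is a hypothetical name; this is a sketch, not the patch's implementation):

```python
import math

def float_bin(x):
    """Return an exact, eval()-able base-2 expression for a float."""
    m, e = math.frexp(x)     # x == m * 2**e, 0.5 <= |m| < 1 for nonzero x
    # Scale the mantissa to an integer; doubles carry 53 bits of precision,
    # and integers up to 2**53 are exact, so no information is lost.
    return '%s * 2.0 ** %d' % (bin(int(m * 2 ** 53)), e - 53)

# Round-trips exactly through eval(), matching the patch's key property.
for x in (0.6, 0.7, 1.3, 3.375, -2.5):
    assert eval(float_bin(x)) == x
```

Unlike the patch, this sketch does not normalize away trailing zero bits, but the round-trip guarantee holds either way.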
Raymond Hettinger wrote:
and I'm definitely not keen on the fact that it's overloaded on the hex/bin/oct builtins.
Can't it be a separate function?
Simplicity. bin/oct/hex have the job of giving alternate base representations for numbers. Nothing is gained by adding a duplicate set of functions in the math module for float inputs.
I'd place additional requirements on using bin/oct/hex for this:
1. The new feature must be available to floating-point types other than float (such as Decimal) in both 2.6 and 3.0 (keeping in mind that 3.0 does not support __bin__, __hex__, or __oct__ methods - it uses only __index__ to implement bin(), hex() and oct()).
2. Other classes (such as Decimal) should be able to leverage the formatting functionality provided for floats.
If it was just a new method on float objects or a new function in the math module, neither of those additional requirements would apply - I would be completely fine with the function only working for actual float objects. However, in either case, I think this also runs afoul of the "we're in beta" argument - yes, it's a nice feature, but I don't think it's one that will cause any great dramas if users don't get their hands on it until 2.7/3.1. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
Actually, after saying I was opposed to __bin__ in 2.6, I said: "Instead, I think the approach used in 3.0 (r64451) should be used instead. That is, if this feature exists at all. I'm -0 on adding bin(), etc. to floats." My last sentence is a little unclear. I meant I'm -0 on adding floats as arguments to bin(), oct(), and hex(). Primarily because a) it's not extensible in 3.0, and b) I find it surprising, in that I'd expect those functions to throw an error for non-integral types (that is, those not having __index__). I think adding a "float_as_binary_expression()" (with a better name) in some module would get the functionality you seek. What is gained by adding this to bin() and friends? Raymond Hettinger wrote:
From: "Guido van Rossum"
I don't care about the details of the patch until we have agreement about which form the feature should take. We don't have that agreement yet.
Updated the patch to address everyone's review comments: http://bugs.python.org/file10742/float8.diff
* Alexander Belopolsky requested exponential notation instead of potentially very long strings of bits. Done
* Alexander Belopolsky requested true mathematical radix 2 representation of a float rather than its 64-bit memory layout. Done
* Antoine Pitrou requested that hex() and oct() be supported as well as bin(). Terry J. Reedy also requested support for hex(). Done.
* Alexander Belopolsky and Alexandre Vassalotti requested that the output be a two-way street -- something that can be round-tripped through eval(). Done.
* Amaury Forgeot d'Arc requested that the implementation not manipulate C strings in place. Fixed -- used PyUnicode_FromFormat() instead.
* Amaury Forgeot d'Arc requested that tests should check if negative numbers have the same representation as their absolute value. Done.
* Mark Dickinson requested sign preserving output for bin(-0.0). We couldn't find a clean way to do this without a special cased output format.
* Mark Dickinson reviewed the NaN/Inf handling. Done.
* Eric Smith requested that the routine be attached to _PyFloat_to_base() instead of attaching to __bin__, __oct__, and __hex__. Done.
* Guido requested that the docs be updated. Done.
* Guido requested that the globally visible C API function name be prefixed with _Py. Done.
* Mark Dickinson requested normalizing output to start with a 1 so that nearby values have similar reprs. Done.
Raymond
Eric Smith wrote:
Actually, after saying I was opposed to __bin__ in 2.6, I said: "Instead, I think the approach used in 3.0 (r64451) should be used instead. That is, if this feature exists at all. I'm -0 on adding bin(), etc. to floats."
My last sentence is a little unclear. I meant I'm -0 on adding floats as arguments to bin(), oct(), and hex(). Primarily because a) it's not extensible in 3.0, and b) I find it surprising, in that I'd expect those functions to throw an error for non-integral types (that is, those not having __index__). I think adding a "float_as_binary_expression()" (with a better name) in some module would get the functionality you seek. What is gained by adding this to bin() and friends?
And to clarify myself, yet again (from private email with Raymond): I actually think it's a useful feature, just not in bin(). I'm sure it will land somewhere, and I'm also sure I'll use it, at least from the interactive prompt. And if bin() were generally extensible for all types, I wouldn't really even care all that much if this feature landed in bin(). But 3.0 is going in the opposite direction, which is what much of my concern is based on, and why I commented at the outset on the 2.6 approach differing from the 3.0 approach. Eric.
On 26/06/2008, Eric Smith
I actually think it's a useful feature, just not in bin(). I'm sure it will land somewhere, and I'm also sure I'll use it, at least from the interactive prompt.
Can you give an example of its use? Maybe there are such examples in the tracker discussion, but I'm not following the tracker, and I think it would add some weight to the discussion here if use cases were noted. At the moment, the impression I get is that most of the python-dev comments are negative, but (without looking to confirm) most of the tracker comments are positive. If so, getting everything in one place will make it easier to see a balanced discussion. Having said all this, I agree with the point that this should be deferred to 2.7/3.1 now we're in beta. So there's hardly a rush. Paul.
Raymond Hettinger
* Antoine Pitrou requested that hex() and oct() be supported as well as bin().
Just to qualify this, I said that if bin() were to gain float support, the same should probably be done for hex() and oct(). That doesn't mean I'm in favor of bin() support for floats. Regards Antoine.
Would you mind reading the rest of *this* thread on python-dev and
respond to the discussion about the design of the feature?
On Thu, Jun 26, 2008 at 4:24 AM, Raymond Hettinger
From: "Guido van Rossum"
I don't care about the details of the patch until we have agreement about which form the feature should take. We don't have that agreement yet.
Updated the patch to address everyone's review comments: http://bugs.python.org/file10742/float8.diff
* Alexander Belopolsky requested exponential notation instead of potentially very long strings of bits. Done
* Alexander Belopolsky requested true mathematical radix 2 representation of a float rather than its 64-bit memory layout. Done
* Antoine Pitrou requested that hex() and oct() be supported as well as bin(). Terry J. Reedy also requested support for hex(). Done.
* Alexander Belopolsky and Alexandre Vassalotti requested that the output be a two-way street -- something that can be round-tripped through eval(). Done.
* Amaury Forgeot d'Arc requested that the implementation not manipulate C strings in place. Fixed -- used PyUnicode_FromFormat() instead.
* Amaury Forgeot d'Arc requested that tests should check if negative numbers have the same representation as their absolute value. Done.
* Mark Dickinson requested sign preserving output for bin(-0.0). We couldn't find a clean way to do this without a special cased output format.
* Mark Dickinson reviewed the NaN/Inf handling. Done.
* Eric Smith requested that the routine be attached to _PyFloat_to_base() instead of attaching to __bin__, __oct__, and __hex__. Done.
* Guido requested that the docs be updated. Done.
* Guido requested that the globally visible C API function name be prefixed with _Py. Done.
* Mark Dickinson requested normalizing output to start with a 1 so that nearby values have similar reprs. Done.
Raymond
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
Would you mind reading the rest of *this* thread on python-dev and respond to the discussion about the design of the feature?
The last four entries were from this thread. I don't know what else you want me to do. I can update the patch as people make suggestions. That's pretty much it. I recapped the earlier discussion from the tracker so the participants in this thread would be aware of the requests that were made there and why. I originally wanted a different output format, but it evolved to the one that's there now to meet the various needs of the posters. This is important background for someone just joining the thread and thinking a different output format would be better. There's a part of this thread that says basically, "fine, but stick it somewhere else." To me, it doesn't make any sense at all to create a parallel set of functions in the math module. To convert a number to binary, it makes sense to use the bin() function. I don't understand this notion that bin() is a sacred cow of integerdom and would be philosophically corrupted if it handled floats also. Raymond
Raymond Hettinger wrote:
There's a part of this thread that says basically, "fine, but stick it somewhere else." To me, it doesn't make any sense at all to create a parallel set of functions in the math module. To convert a number to binary, it makes sense to use the bin() function. I don't understand this notion that bin() is a sacred cow of integerdom and would be philosophically corrupted if it handled floats also.
It isn't the extension to something other than integers that bothers me, it's the extension to floats in a way that can't be easily supported for other types such as decimal.Decimal and fractions.Fraction. Well, that and the beta deadline (we have to draw the line somewhere, or we'll be stuck in an eternal spiral of "just one more feature") Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
On Jun 26, 2008, at 10:40 AM, Nick Coghlan wrote:
Well, that and the beta deadline (we have to draw the line somewhere, or we'll be stuck in an eternal spiral of "just one more feature")
Guido wanted to get the beta out when we did exactly so we could draw this line in the sand. I'd much rather people be spending time getting what features we do have tested, stabilized, bug fixed, and turning the buildbots green across the board. -Barry
Barry Warsaw wrote:
On Jun 26, 2008, at 10:40 AM, Nick Coghlan wrote:
Well, that and the beta deadline (we have to draw the line somewhere, or we'll be stuck in an eternal spiral of "just one more feature")
Guido wanted to get the beta out when we did exactly so we could draw this line in the sand. I'd much rather people be spending time getting what features we do have tested, stabilized, bug fixed, and turning the buildbots green across the board.
Having been caught by the 2.5 beta deadline with the changes that eventually became PEP 361 (and I think were significantly improved by the additional attention that was available due to the delay) I understand completely. (And to everyone with features that get bumped to 2.7/3.1 because of this... while a number of you no doubt know this already, it really is astonishing how soon the next release seems to roll around, even with our fairly leisurely release schedules!) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
On Thu, Jun 26, 2008 at 8:21 AM, Nick Coghlan
Barry Warsaw wrote:
On Jun 26, 2008, at 10:40 AM, Nick Coghlan wrote:
Well, that and the beta deadline (we have to draw the line somewhere, or we'll be stuck in an eternal spiral of "just one more feature")
Guido wanted to get the beta out when we did exactly so we could draw this line in the sand. I'd much rather people be spending time getting what features we do have tested, stabilized, bug fixed, and turning the buildbots green across the board.
Having been caught by the 2.5 beta deadline with the changes that eventually became PEP 361 (and I think were significantly improved by the additional attention that was available due to the delay) I understand completely.
(And to everyone with features that get bumped to 2.7/3.1 because of this... while a number of you no doubt know this already, it really is astonishing how soon the next release seems to roll around, even with our fairly leisurely release schedules!)
I'd like to separate the concerns though. I personally don't want to see the feature in its current form added to 2.7/3.1 either. As others pointed out, it's a wart on the bin/oct/hex functions. So as far as the feature design goes, I offer some suggestions: a new module; or a new function in math; or a new method on float. Since Raymond is the champion for the feature let him choose the API from those alternatives. Regarding the addition to 2.6/3.0 post beta 1, I think a new module has the most chance of success, especially if it's written in Python in such a way as to need minimal changes between 2.6 and 3.0. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
I personally don't want to see the feature in its current form added to 2.7/3.1 either. As others pointed out, it's a wart on the bin/oct/hex functions.
So as far as the feature design goes, I offer some suggestions: a new module; or a new function in math; or a new method on float. Since Raymond is the champion for the feature let him choose the API from those alternatives.
Regarding the addition to 2.6/3.0 post beta 1, I think a new module has the most chance of success, especially if it's written in Python in such a way as to need minimal changes between 2.6 and 3.0.
One of the other reasons I'd like to postpone the feature is that I think that with a clean design behind it, it could be an elegant addition to those builtins rather than a wart. But helping Raymond convince you or anyone else of that is well down my to-do list at the moment (which I think just got longer with the test_support discussion...) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org
From: "Guido van Rossum"
So as far as the feature design goes, I offer some suggestions: a new module; or a new function in math; or a new method on float. Since Raymond is the champion for the feature let him choose the API from those alternatives.
I choose bin/hex/oct methods on floatobjects. Will work-up a patch. Raymond
On Thu, Jun 26, 2008 at 11:07 AM, Raymond Hettinger
From: "Guido van Rossum"
So as far as the feature design goes, I offer some suggestions: a new module; or a new function in math; or a new method on float. Since Raymond is the champion for the feature let him choose the API from those alternatives.
I choose bin/hex/oct methods on floatobjects. Will work-up a patch.
Let's step back and discuss the API some more. - Do we need all three? - If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.) - What should the output format be? I know you originally favored 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex builtins, the constraint that it needs to be an eval() able expression may be dropped (unless you see a use case for that too). -- --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum"
Let's step back and discuss the API some more.
- Do we need all three?
I think so -- see the reasons below. Of course, my first choice was not on your list. To me, the one obvious way to convert a number to an eval-able string in a different base is to use bin(), oct(), or hex(). But that appears to be off the table for reasons that I've read but that don't make any sense to me. It seems simple enough, extendable enough, and clean enough for bin/oct/hex to use __index__ if present and __float__ if not.
- If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.)
I don't think it's user-friendly to have the float-to-bin API fail to parallel the int-to-bin API. IMO, it should be done the same way in both places. I don't find it attractive in appearance. Any use case I can imagine involves multiple calls using the same base, and I would likely end up using functools.partial or some such to factor out the repeated use of the same variable. In particular, it's less usable with a series of numbers at the interactive prompt. That is one of the primary use cases, since it allows you to see exactly what is happening with float arithmetic:
>>> .6 + .7
1.2999999999999998
>>> bin(.6)
'0b10011001100110011001100110011001100110011001100110011 * 2.0 ** -53'
>>> bin(.7)
'0b1011001100110011001100110011001100110011001100110011 * 2.0 ** -52'
>>> bin(.6 + .7)
'0b101001100110011001100110011001100110011001100110011 * 2.0 ** -50'
>>> bin(1.3)
'0b10100110011001100110011001100110011001100110011001101 * 2.0 ** -52'
Or checking whether a number is exactly representable:
>>> bin(3.375)
'0b11011 * 2.0 ** -3'
Both of those bits of analysis become awkward with the tobase() method:
>>> (.6).tobase(2)
...
- What should the output format be? I know you originally favored 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex builtins, the constraint that it needs to be an eval() able expression may be dropped (unless you see a use case for that too).
The other guys convinced me that round-tripping was important and that there is a good use case for being able to read/write precisely specified floats in a platform-independent manner. Also, my original idea didn't scale well without exponential notation -- i.e. bin(125E-100) would have a heckofa lot of leading zeroes. Terry and Mark also pointed out that hex with exponential notation is the normal notation used in papers on floating-point arithmetic. Lastly, once I changed over to the new way, it dramatically simplified the implementation. Raymond
On Thu, Jun 26, 2008 at 12:52 PM, Raymond Hettinger
From: "Guido van Rossum"
Let's step back and discuss the API some more.
- Do we need all three?
I think so -- see the reasons below.
Sounds like Mark Dickinson only cares about bin and hex.
Of course, my first choice was not on your list. To me, the one obvious way to convert a number to an eval-able string in a different base is to use bin(), oct(), or hex(). But that appears to be off the table for reasons that I've read but that don't make any sense to me. It seems simple enough, extendable enough, and clean enough for bin/oct/hex to use __index__ if present and __float__ if not.
That's not extendable to types that aren't int or float though. And it would accept Decimal instances which seems a really odd thing to do.
- If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.)
I don't think it's user-friendly to have the float-to-bin API fail to parallel the int-to-bin API. IMO, it should be done the same way in both places.
Consistency only goes so far. We have 0b, 0o and 0x notations for integers, and the bin/oct/hex builtins are meant to invert those. We don't have base-{2,8,16} literals for floats.
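The inversion can be checked directly (Python 3 spellings of the literals):

```python
# bin/oct/hex on ints exactly invert the 0b/0o/0x integer literals.
assert bin(5) == '0b101' and 0b101 == 5
assert oct(15) == '0o17' and 0o17 == 15
assert hex(31) == '0x1f' and 0x1f == 31
```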
I don't find it attractive in appearance. Any use case I can imagine involves multiple calls using the same base, and I would likely end up using functools.partial or some such to factor out the repeated use of the same variable. In particular, it's less usable with a series of numbers at the interactive prompt. That is one of the primary use cases, since it allows you to see exactly what is happening with float arithmetic:
>>> .6 + .7
1.2999999999999998
>>> bin(.6)
'0b10011001100110011001100110011001100110011001100110011 * 2.0 ** -53'
>>> bin(.7)
'0b1011001100110011001100110011001100110011001100110011 * 2.0 ** -52'
>>> bin(.6 + .7)
'0b101001100110011001100110011001100110011001100110011 * 2.0 ** -50'
>>> bin(1.3)
'0b10100110011001100110011001100110011001100110011001101 * 2.0 ** -52'
Or checking whether a number is exactly representable:
>>> bin(3.375)
'0b11011 * 2.0 ** -3'
Both of those bits of analysis become awkward with the tobase() method:
>>> (.6).tobase(2)
You don't need the parentheses around .6. I think much fewer than 0.01% of Python users will ever need this. It's a one-liner helper function if you prefer to say bin(x) instead of x.bin().
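Guido's aside about the parentheses checks out at the prompt: the tokenizer reads '.6' as a complete float literal, so a trailing attribute access needs no parentheses (.real stands in here for the proposed, still-hypothetical method):

```python
# '.6' tokenizes as a float literal, so '.6.real' is an ordinary
# attribute access -- no parentheses required.
assert .6.real == 0.6
assert (.6).real == 0.6        # the parenthesized spelling is equivalent

# Integer literals are the awkward case: '1.real' is a syntax error,
# since the tokenizer greedily consumes '1.' -- hence '(1).real'.
```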
- What should the output format be? I know you originally favored 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex builtins, the constraint that it needs to be an eval() able expression may be dropped (unless you see a use case for that too).
The other guys convinced me that round tripping was important and that there is a good use case for being able to read/write precisely specified floats in a platform independent manner.
Can you summarize those reasons? Who are the users of that feature? I'm still baffled why a feature whose only users are extreme experts needs to have such a prominent treatment. Surely there are a lot more Python users who call urlopen() or urlparse() all day long. Should these be built-in functions then?
Also, my original idea didn't scale well without exponential notation -- i.e. bin(125E-100) would have a heckofa lot of leading zeroes. Terry and Mark also pointed out that hex with exponential notation is the normal notation used in papers on floating-point arithmetic. Lastly, once I changed over to the new way, it dramatically simplified the implementation.
I agree that you need to have a notation using an exponent. If it weren't for the roundtripping, I'd probably have preferred something which simply showed me the bits of the IEEE floating point number broken out into mantissa and exponent -- that seems more educational to me than normalizing things so that the last bit is nonzero. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
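The un-normalized presentation Guido describes might look like this sketch (ieee_parts is a made-up name; zeros, subnormals, infinities, and NaNs are ignored for brevity):

```python
import struct

def ieee_parts(x):
    """Show a normal double's stored mantissa bits and unbiased exponent."""
    (n,) = struct.unpack('>Q', struct.pack('>d', x))
    sign = '-' if n >> 63 else '+'
    exponent = ((n >> 52) & 0x7FF) - 1023            # remove the IEEE bias
    bits = format(n & ((1 << 52) - 1), '052b')       # all 52 fraction bits
    return '%s1.%s * 2 ** %d' % (sign, bits, exponent)

# 3.375 shows as '+1.1011000...0 * 2 ** 1': trailing zeros are kept
# rather than normalized away as in the patch's '0b11011 * 2.0 ** -3'.
```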
Raymond Hettinger wrote:
From: "Guido van Rossum"
Let's step back and discuss the API some more.
- Do we need all three?
I think so -- see the the reasons below.
I would prefer 1, see below.
Of course, my first choice was not on your list. To me, the one obvious way to convert a number to a eval-able string in a different base is to use bin(), oct(), or hex(). But that appears to be off the table for reasons that I've read but don't make any sense to me.
Let me try. I am one of those who prefer smaller to bigger for the core language, to make it easier to learn and teach. But, to me, there is a deeper consideration that applies here. A Python interpreter, human or mechanical, must do exact integer arithmetic. But a Python interpreter does not have to convert float literals to fixed-size binary and does *not* have to do float arithmetic with binary representations that are usually approximations. (Indeed, human interpreters do neither, which is why they are often surprised at CPython's float output, and which is why this function will be useful.) If built-in functions are part of the language definition, as Guido just clarified, their definition and justification should not depend on the float implementation.
It seems simple enough, extendable enough, and clean enough for bin/oct/hex to use __index__ if present and __float__ if not.
To me, a binary representation, in whatever base, of a Decimal is senseless. The point of this issue is to reveal the exact binary bit pattern of float instances.
- If so, why not .tobase(N)? (Even if N is restricted to 2, 8 and 16.)
I don't think it's user-friendly to have the float-to-bin API fail to parallel the int-to-bin API. IMO, it should be done the same way in both places.
I would like to turn this around. I think that 3 nearly identical built-ins is 2 too many. I am going to propose on the Py3 list that bin, oct, and hex be condensed to one function, bin(integer, base=2,8,or16), for 3.1 if not 3.0. Base 8 and 16 are, to me, compressed binary. Three methods is definitely too many for a somewhat subsidiary function. So, I would like to see float.bin([base=2])
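The condensed builtin Terry proposes is easy to prototype in pure Python. This is only an illustrative sketch (the name `bin_int` and the restriction to bases 2, 8, and 16 are assumptions, not part of any patch):

```python
def bin_int(n, base=2):
    # Illustrative sketch of a single builtin covering bin/oct/hex for
    # integers; bases other than 2, 8, 16 are rejected via KeyError.
    codes = {2: ('0b', 'b'), 8: ('0o', 'o'), 16: ('0x', 'x')}
    prefix, spec = codes[base]
    sign, n = ('-', -n) if n < 0 else ('', n)
    return sign + prefix + format(n, spec)
```

With base defaulting to 2, `bin_int(10)` matches today's `bin(10)` exactly, while the other two bases fold into the same call.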
I don't find it attractive in appearance. Any use case I can imagine involves multiple calls using the same base, and I would likely end up using functools.partial or somesuch to factor out the repeated use of the same variable.
Make the base that naive users want to see the default. I believe this to be 2. Numerical analysts who want base 16 can deal with partial if they really have scattered calls (as opposed to a few within loops) and cannot deal with typing '16' over and over.
>>> bin(.6)
'0b10011001100110011001100110011001100110011001100110011 * 2.0**-53'
... Both of those bits of analysis become awkward with the tobase() method: (.6).tobase(2)
Eliminate the unneeded parentheses and default value, and this is
.6.bin() which is just one extra char.
- What should the output format be? I know you originally favored 0b10101.010101 etc. Now that it's not overloaded on the bin/oct/hex builtins, the constraint that it needs to be an eval()-able expression may be dropped (unless you see a use case for that too).
The other guys convinced me that round tripping was important and that there is a good use case for being able to read/write precisely specified floats in a platform independent manner.
Definitely. The paper I referenced in the issue discussion, http://bugs.python.org/issue3008 mentioned a few times here, is http://hal.archives-ouvertes.fr/docs/00/28/14/29/PDF/floating-point-article....
Also, my original idea didn't scale well without exponential notation -- i.e. bin(125E-100) would have a heckofa lot of leading zeroes. Terry and Mark also pointed out that hex with exponential notation is the normal notation used in papers on floating point arithmetic. Lastly, once I changed over to the new way, it dramatically simplified the implementation.
I originally thought I preferred the 'hexponential' notation that uses P for power instead of E for exponential. But with multiple bases, the redundancy of repeating the bases is ok, and being able to eval() without changing the parser is a plus. But I would prefer losing the spaces around the ** operator. Terry Jan Reedy
On Thu, Jun 26, 2008 at 11:00 PM, Terry Reedy
Definitely. The paper I referenced in the issue discussion, http://bugs.python.org/issue3008 mentioned a few times here, is http://hal.archives-ouvertes.fr/docs/00/28/14/29/PDF/floating-point-article....
Perhaps it's worth reproducing the most relevant paragraph of that paper (the end of section 2.1) here:

"""Conversion to and from decimal representation is delicate; special care must be taken in order not to introduce inaccuracies or discrepancies. [Steele and White, 1990, Clinger, 1990]. Because of this, C99 introduces hexadecimal floating-point literals in source code. [ISO, 1999, §6.4.4.2] Their syntax is as follows: 0xmmmmmm.mmmmp±ee where mmmmmm.mmmm is a mantissa in hexadecimal, possibly containing a point, and ee is an exponent, written in decimal, possibly preceded by a sign. They are interpreted as [mmmmmm.mmmm]_16 × 2^ee. Hexadecimal floating-point representations are especially important when values must be represented exactly, for reproducible results — for instance, for testing "borderline cases" in algorithms. For this reason, we shall use them in this paper wherever it is important to specify exact values. See also Section 4.4 for more information on inputting and outputting floating-point values."""
Raymond Hettinger wrote:
To me, the one obvious way to convert a number to a eval-able string in a different base is to use bin(), oct(), or hex().
What use cases are there for an eval-able representation of a float in those bases, as opposed to a human-readable one? -- Greg
Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else. E.g.
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
BTW I am still hoping to be able to change the output of the second command to just "3.14", but the tracker issue for that (http://bugs.python.org/issue1580) is stuck on trying to decide whether it's okay to have repr(<float>) occasionally return a string that doesn't convert to the exact same value on another platform. (My preferred approach would ensure that it does convert to the same value on the same platform, because that's how I'd compute it.) Perhaps the existence of an unambiguous hex format that is also interchangeable with Java and C (and presumably C++) would alleviate that concern. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
PS. I can't get excited about having support for this in %-style
format strings (and even less so now %a already means "call ascii()").
It would be easy enough to add support for it to float.__format__()
though.
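For concreteness, the proposed pair can be sketched in pure Python with math.frexp/ldexp. This is a rough illustration only: zeros are special-cased; subnormals, infinities, and NaNs are not handled; and the names tohex/fromhex merely follow the example above, not any actual API.

```python
import math

def tohex(x):
    # Sketch of the proposed tohex(): format a float as a C99-style
    # hex literal 0x1.<fraction>p<exponent>.  Normal doubles only.
    if x == 0.0:
        return '0x0.0p+0'
    sign = '-' if x < 0 else ''
    m, e = math.frexp(abs(x))          # abs(x) == m * 2**e, 0.5 <= m < 1
    m, e = m * 2.0, e - 1              # renormalize so 1.0 <= m < 2.0
    frac = int((m - 1.0) * 16 ** 13)   # 13 hex digits = 52 fraction bits
    return '%s0x1.%013xp%+d' % (sign, frac, e)

def fromhex(s):
    # Inverse sketch: parse '[-]0x<int>.<frac>p<exp>' back to a float.
    sign = 1.0
    if s.startswith('-'):
        sign, s = -1.0, s[1:]
    mant, exp = s[2:].split('p')
    intpart, _, frac = mant.partition('.')
    value = int(intpart, 16)
    if frac:
        value += int(frac, 16) / 16.0 ** len(frac)
    return sign * math.ldexp(value, int(exp))
```

Since every step is a shift or an exact power-of-two scaling, the round trip is exact: `fromhex(tohex(x)) == x` for any normal double.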
On Fri, Jun 27, 2008 at 12:02 PM, Guido van Rossum
Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else.
E.g.
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
BTW I am still hoping to be able to change the output of the second command to just "3.14", but the tracker issue for that (http://bugs.python.org/issue1580) is stuck on trying to decide whether it's okay to have repr(<float>) occasionally return a string that doesn't convert to the exact same value on another platform. (My preferred approach would ensure that it does convert to the same value on the same platform, because that's how I'd compute it.) Perhaps the existence of an unambiguous hex format that is also interchangeable with Java and C (and presumably C++) would alleviate that concern.
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Fri, Jun 27, 2008 at 8:02 PM, Guido van Rossum
Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else.
E.g.
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
This would certainly be enough for me, though I think there's still some educational value in having binary output available. But that's just a matter of substituting a four-bit binary string for each hexadecimal digit (or learning to read hexadecimal as though it were binary). In fromhex, what would be done with a string that gives more hex digits than the machine precision can support? An obvious answer is just to round to the nearest float, but since part of the point of hex floats is having a way to specify a given value *exactly*, it might make more sense to raise an exception rather than changing the value by rounding it. Mark
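The strict behaviour Mark raises is easy to sketch: normalize the parsed significand and refuse anything that needs more than the 53 bits of an IEEE double. This is a hypothetical helper, not a proposal; signs and exponent-range checks are omitted for brevity.

```python
import math

def fromhex_exact(s):
    # Hypothetical strict parser: accept '0x<int>.<frac>p<exp>' only if
    # the value fits exactly in a 53-bit IEEE double significand.
    mant, exp = s[2:].split('p')
    intpart, _, frac = mant.partition('.')
    num = int(intpart + frac, 16)     # exact integer significand
    e = int(exp) - 4 * len(frac)      # value == num * 2**e
    while num and num % 2 == 0:       # normalize: drop trailing zero bits
        num //= 2
        e += 1
    if num.bit_length() > 53:
        raise ValueError('not exactly representable: %r' % s)
    return math.ldexp(num, e)
```

C99 and Java instead round to nearest in this situation, which is what the thread eventually settles on below.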
Mark Dickinson wrote:
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
How about just one self-inverse method .hex? .hex(float/hexstring) returns corresponding hexstring/float
On Fri, Jun 27, 2008 at 3:59 PM, Terry Reedy
Mark Dickinson wrote:
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
How about just one self-inverse method .hex? .hex(float/hexstring) returns corresponding hexstring/float
That seems to be a misplaced attempt at economy, obscuring the intent from the reader. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Fri, Jun 27, 2008 at 2:54 PM, Mark Dickinson
On Fri, Jun 27, 2008 at 8:02 PM, Guido van Rossum
wrote: Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else.
E.g.
>>> math.tohex(3.14)
'0x1.91eb851eb851fp+1'
>>> math.fromhex('0x1.91eb851eb851fp+1')
3.1400000000000001
This would certainly be enough for me, though I think there's still some educational value in having binary output available. But that's just a matter of substituting a four-bit binary string for each hexadecimal digit (or learning to read hexadecimal as though it were binary).
If it's educational it can be left as an exercise for the reader. :-)
In fromhex, what would be done with a string that gives more hex digits than the machine precision can support? An obvious answer is just to round to the nearest float, but since part of the point of hex floats is having a way to specify a given value *exactly*, it might make more sense to raise an exception rather than changing the value by rounding it.
Whatever Java and C99 do. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Fri, Jun 27, 2008 at 8:02 PM, Guido van Rossum
Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else.
By the way, this particular format is also recommended by the draft versions of IEEE 754r that I've seen: section 7.12.2 of draft version 1.2.5 (this is publicly available---there's a link from the wikipedia 754r page) says: """Implementations supporting binary formats shall provide conversions between all supported internal binary formats and external hexadecimal character sequences. External hexadecimal character sequences for finite numbers are of the form specified by C99 subclauses: 6.4.4.2 floating constants, 7.20.1.3 strtod, 7.19.6.2 fscanf (a, e, f, g), and 7.19.6.1 fprintf (a, A).""" More recent 754r drafts spell the grammar out explicitly instead of referring to C99, and weaken the 'shall' (i.e., 'is required to') to a 'should' ('is recommended to'). Mark
Now that I've learned about the hex float format supported by C++ and Java, I wonder if it wouldn't be better to support conversion to and from that format and nothing else.
I would be fine with that, and prefer it over the original change. Regards, Martin
So as far as the feature design goes, I offer some suggestions: a new module; or a new function in math; or a new method on float. Since Raymond is the champion for the feature let him choose the API from those alternatives.
I choose bin/hex/oct methods on floatobjects. Will work-up a patch.
I think the feature is misguided in the first place. Why do you want a hex representation of floating point numbers? Can't you use struct.pack for that? And, if bin/hex/oct are useful, why not base 6 (say)? Regards, Martin
On Thu, Jun 26, 2008 at 8:55 PM, "Martin v. Löwis"
I think the feature is misguided in the first place. Why do you want a hex representation of floating point numbers?
Answering for myself: because it gives an exact representation of a floating-point number in a fairly human-readable format.
Can't you use struct.pack for that?
struct.pack would only show the bit layout, leaving the user to manually extract the sign bit, exponent, and fraction, and then make sense of the whole thing. The proposed feature works at a higher level of abstraction, directly describing the *value* of the float rather than its bit layout. In particular, this means it'll make sense across platforms, regardless of variations in bit layout.

And, if bin/hex/oct are useful, why not base 6 (say)?
I'd say that bin and hex are special: bin is natural because floats are usually thought of, and stored as, binary numbers. hex is special because it gives a compact way of representing a float, and because there's already a history of using hex floats in numerical analysis literature and in programming languages (C99, Java, ...) I have to admit that I can't see much use for octal floats. Mark
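Mark's point about the level of abstraction can be made concrete: starting from raw bits, a user has to do something like the following to recover the value. This is only a sketch, assuming IEEE 754 doubles and handling normal numbers only; the helper name is illustrative.

```python
import struct

def decode_bits(x):
    # Manually unpack an IEEE 754 double into its sign bit, biased
    # exponent, and 52-bit fraction -- the work struct.pack leaves
    # to the user.
    (bits,) = struct.unpack('<Q', struct.pack('<d', x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # 52 fraction bits
    # Reconstruct a normal number: (-1)**s * 1.f * 2**(e - 1023)
    value = (-1.0) ** sign * (1 + fraction / 2.0 ** 52) * 2.0 ** (exponent - 1023)
    return sign, exponent, fraction, value
```

A value-level hex string like '0x1.91eb851eb851fp+1' hands the reader the significand and exponent directly, skipping all of the above.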
On Thu, Jun 26, 2008 at 1:17 PM, Mark Dickinson
I'd say that bin and hex are special: bin is natural because floats are usually thought of, and stored as, binary numbers. hex is special because it gives a compact way of representing a float, and because there's already a history of using hex floats in numerical analysis literature and in programming languages (C99, Java, ...)
Can you show us what APIs and output formats C99 and Java support? Maybe we can borrow something from there rather than reinventing the wheel? -- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Thu, Jun 26, 2008 at 9:28 PM, Guido van Rossum
Can you show us what APIs and output formats C99 and Java support? Maybe we can borrow something from there rather than reinventing the wheel?
Java's toHexString method is documented at: http://java.sun.com/javase/6/docs/api/java/lang/Double.html#toHexString(doub...) It's disadvantage from Python's point of view is that some features are IEEE 754 specific (e.g. treatment of subnormals, which don't exist for most other floating point types). C99's support for hex literals uses a similar format; the standard is less specific about the precise output format, but it's still of the form 0x1.<fraction>p<exponent> Incidentally, the funny 'p' for the exponent instead of 'e' is apparently there to avoid ambiguity in something like: 0x1e+3 Mark
On Thu, Jun 26, 2008 at 9:55 PM, Mark Dickinson
It's disadvantage from Python's point of view is that some features are IEEE 754
Aargh! I can't believe I wrote that. Its. Its. Its. Anyway; some more detail: Both C99 and Java 1.5/1.6 support hex floating-point literals; both in exactly the same format, as far as I can tell. Here are the relevant productions from the Java grammar:

HexDigit: one of
    0 1 2 3 4 5 6 7 8 9 a b c d e f A B C D E F

HexNumeral:
    0 x HexDigits
    0 X HexDigits

HexDigits:
    HexDigit
    HexDigit HexDigits

HexadecimalFloatingPointLiteral:
    HexSignificand BinaryExponent FloatTypeSuffix_opt

HexSignificand:
    HexNumeral
    HexNumeral .
    0x HexDigits_opt . HexDigits
    0X HexDigits_opt . HexDigits

BinaryExponent:
    BinaryExponentIndicator SignedInteger

BinaryExponentIndicator: one of
    p P

Java's 'Double' class has a 'toHexString' method that outputs a valid hex floating point string, and the Double() constructor also accepts such strings. C99 also appears to have full support for input/output of hex floats; e.g. using strtod and printf('%a', ...). Not sure how helpful this is. Mark
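The grammar above translates fairly directly into a regular expression. This is an illustrative Python sketch that ignores the float type suffixes; the names are made up for the example.

```python
import re

# Rough translation of the Java/C99 hex-float grammar: optional sign,
# 0x/0X prefix, hex digits with an optional point, and a mandatory
# binary exponent written in decimal.
HEX_FLOAT = re.compile(r"""
    [+-]?                                     # optional sign
    0[xX]
    (?: [0-9a-fA-F]+ (?: \. [0-9a-fA-F]* )?   # digits, optional fraction
      | \. [0-9a-fA-F]+ )                     # or a fraction alone
    [pP] [+-]? [0-9]+                         # binary exponent, in decimal
""", re.VERBOSE)

def is_hex_float(s):
    return HEX_FLOAT.fullmatch(s) is not None
```

The mandatory 'p' also explains Mark's aside about ambiguity: without it, '0x1e+3' could be read as a hex integer followed by '+3'.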
I think the feature is misguided in the first place. Why do you want a hex representation of floating point numbers?
Answering for myself: because it gives an exact representation of a floating-point number in a fairly human-readable format.
Ok. But

py> binascii.hexlify(struct.pack("d", 3.14))
'1f85eb51b81e0940'

does that already, no? You won't know the precise value, but you won't know that with hex support, either.
struct.pack would only show the bit layout, leaving the user to manually extract the sign bit, exponent, and fraction, and then make sense of the whole thing.
I'd question that the user is able to make sense of a number when mantissa and exponent is represented in hex.
I'd say that bin and hex are special: bin is natural because floats are usually thought of, and stored as, binary numbers. hex is special because it gives a compact way of representing a float, and because there's already a history of using hex floats in numerical analysis literature and in programming languages (C99, Java, ...)
Then I'd argue that the feature should be symmetric: If there is support for printing floating point numbers as hex, there should also be support for hex floating point literals. Also, to follow C's tradition, it would be better if that was *not* integrated into the hex function (or a hex method), but if there was support for %a in string formatting. Regards, Martin
On Thu, Jun 26, 2008 at 9:30 PM, "Martin v. Löwis"
Answering for myself: because it gives an exact representation of a floating-point number in a fairly human-readable format.
Ok. But
py> binascii.hexlify(struct.pack("d", 3.14)) '1f85eb51b81e0940'
does that already, no? You won't know the precise value, but you won't know that with hex support, either.
The output from hex_float(3.14) would be something like:

'0x1.91eb851eb851fp+1'

The exponent is still usually given in decimal; there's no need for it to be hexadecimal for exactness.

I'd question that the user is able to make sense of a number when mantissa and exponent is represented in hex.
I think the above is still a bit easier to understand than if one has to figure out where the sign/exponent and exponent/fraction bit boundaries are, unbias the exponent, and add the extra hidden '1' bit into the mantissa. That's a lot of mental work.
Then I'd argue that the feature should be symmetric: If there is support for printing floating point numbers as hex, there should also be support for hex floating point literals.
I agree with this. Or at least support for hex floating point strings, if not literals.
Also, to follow C's tradition, it would be better if that was *not* integrated into the hex function (or a hex method), but if there was support for %a in string formatting.
I'd be delighted with '%a' support. Mark
On Thu, Jun 26, 2008 at 1:46 PM, Mark Dickinson
I'd be delighted with '%a' support.
Remind me what %a does? -- --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
On Thu, Jun 26, 2008 at 1:46 PM, Mark Dickinson
wrote: I'd be delighted with '%a' support.
Remind me what %a does?
It's a C99 feature. From the spec (7.19.6.1p8):

a,A  A double argument representing a floating-point number is converted in the style [-]0xh.hhhhp±d, where there is one hexadecimal digit (which is nonzero if the argument is a normalized floating-point number and is otherwise unspecified) before the decimal-point character [235] and the number of hexadecimal digits after it is equal to the precision; if the precision is missing and FLT_RADIX is a power of 2, then the precision is sufficient for an exact representation of the value; if the precision is missing and FLT_RADIX is not a power of 2, then the precision is sufficient to distinguish [236] values of type double, except that trailing zeros may be omitted; if the precision is zero and the # flag is not specified, no decimal-point character appears. The letters abcdef are used for a conversion and the letters ABCDEF for A conversion. The A conversion specifier produces a number with X and P instead of x and p. The exponent always contains at least one digit, and only as many more digits as necessary to represent the decimal exponent of 2. If the value is zero, the exponent is zero. A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.

Footnotes:
235) Binary implementations can choose the hexadecimal digit to the left of the decimal-point character so that subsequent digits align to nibble (4-bit) boundaries.
236) The precision p is sufficient to distinguish values of the source type if 16^(p-1) > b^n where b is FLT_RADIX and n is the number of base-b digits in the significand of the source type. A smaller p might suffice depending on the implementation's scheme for determining the digit to the left of the decimal-point character.
This is symmetric with C99's hexadecimal floating point literals:

hexadecimal-floating-constant:
    hexadecimal-prefix hexadecimal-fractional-constant binary-exponent-part floating-suffix-opt
    hexadecimal-prefix hexadecimal-digit-sequence binary-exponent-part floating-suffix-opt

hexadecimal-fractional-constant:
    hexadecimal-digit-sequence-opt . hexadecimal-digit-sequence
    hexadecimal-digit-sequence .

binary-exponent-part:
    p sign-opt digit-sequence
    P sign-opt digit-sequence

hexadecimal-digit-sequence:
    hexadecimal-digit
    hexadecimal-digit-sequence hexadecimal-digit

scanf and strtod support the same format. Regards, Martin
On Thu, Jun 26, 2008 at 10:28 PM, Guido van Rossum
Remind me what %a does?
From the C99 standard (section 7.19.6.1): A double argument representing a floating-point number is converted in the style [−]0xh.hhhhp±d, [...]
Mark Dickinson schrieb:
On Thu, Jun 26, 2008 at 10:28 PM, Guido van Rossum
wrote: Remind me what %a does?
From the C99 standard (section 7.19.6.1):
A double argument representing a floating-point number is converted in the style [−]0xh.hhhhp±d, [...]
Let me remind you that %a currently means "call ascii()" in 3.0. Georg
On Thu, Jun 26, 2008 at 11:00 PM, Georg Brandl
Let me remind you that %a currently means "call ascii()" in 3.0.
Oh well. That's out then. I'll rephrase to "I'd be delighted with something similar in spirit to '%a' support." :-) Mark
Mark Dickinson wrote:
On Thu, Jun 26, 2008 at 11:00 PM, Georg Brandl
wrote: Let me remind you that %a currently means "call ascii()" in 3.0.
Oh well. That's out then. I'll rephrase to "I'd be delighted with something similar in spirit to '%a' support." :-)
It could be added to str.format(). Well, actually float.__format__(). Not that I'm advocating it, but it's a place it would fit, and since it's restricted to the float format specifier, it would at least be well contained. And other types are free to implement it, or not.
[MvL]
Then I'd argue that the feature should be symmetric: If there is support for printing floating point numbers as hex, there should also be support for hex floating point literals.
[Mark]
I agree with this. Or at least support for hex floating point strings, if not literals.
ISTM, that the currently proposed output format gives us this benefit for free (no changes to the parser). The format is already close to the C99 notation but replaces the 'p' with '* 2.0 **' which I find to be both readable and self-explanatory. Raymond
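Raymond's point can be checked directly: a string in that style is valid Python today, with no parser changes. The mantissa and exponent below are the bits of 3.14, used purely as an example.

```python
# Replacing C99's 'p' exponent marker with '* 2.0 **' yields a string
# the existing parser already evaluates back to the original float.
mantissa, exponent = 0x191eb851eb851f, -51   # 3.14 == mantissa * 2**-51
s = '%s * 2.0 ** %d' % (bin(mantissa), exponent)
assert eval(s) == 3.14
```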
On Thu, Jun 26, 2008 at 10:30 PM, Raymond Hettinger
ISTM, that the currently proposed output format gives us this benefit for free (no changes to the parser). The format is already close to the C99 notation but replaces the 'p' with '* 2.0 **' which I find to be both readable and self-explanatory.
There's one other major difference between the C99 notation and the current patch: the C99 notation includes a (hexa)decimal point. The advantages of this include: - the exponent gives a rough idea of the magnitude of the number, and - the exponent doesn't vary with changes to the least significant bits of the float. The disadvantage is the loss of evalability. (Is that a word?) Mark
The disadvantage is the loss of evalability. (Is that a word?)
Until the parser has support for it, having a float class method, or even the float callable itself for conversion seems reasonable. If repr() didn't produce it, eval() doesn't need to understand it. Regards, Martin
From: "Mark Dickinson"
There's one other major difference between the C99 notation and the current patch: the C99 notation includes a (hexa)decimal point. The advantages of this include:
- the exponent gives a rough idea of the magnitude of the number, and - the exponent doesn't vary with changes to the least significant bits of the float.
Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R? Are you thinking of math module functions or as a method and classmethod on floats? Raymond
On Sat, Jun 28, 2008 at 4:46 PM, Raymond Hettinger
From: "Mark Dickinson"
There's one other major difference between the C99 notation and the current patch: the C99 notation includes a (hexa)decimal point. The advantages of this include:
- the exponent gives a rough idea of the magnitude of the number, and - the exponent doesn't vary with changes to the least significant bits of the float.
Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R?
Dunno about everyone, but I'm +1 on that.
Are you thinking of math module functions or as a method and classmethod on floats?
I'd prefer math module functions. Alex
On Sun, Jun 29, 2008 at 3:12 AM, Alex Martelli
On Sat, Jun 28, 2008 at 4:46 PM, Raymond Hettinger
wrote: Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R?
Dunno about everyone, but I'm +1 on that.
Are you thinking of math module functions or as a method and classmethod on floats?
I'd prefer math module functions.
I'm halfway through implementing this as a pair of float methods. Are there compelling reasons to prefer math module functions over float methods, or vice versa? Personally, I'm leaning slightly towards float methods: for me, these conversions are important enough to belong in the core language. But I don't have strong feelings either way. Mark
Float methods are fine.
On Fri, Jul 4, 2008 at 2:39 AM, Mark Dickinson
On Sun, Jun 29, 2008 at 3:12 AM, Alex Martelli
wrote: On Sat, Jun 28, 2008 at 4:46 PM, Raymond Hettinger
wrote: Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R?
Dunno about everyone, but I'm +1 on that.
Are you thinking of math module functions or as a method and classmethod on floats?
I'd prefer math module functions.
I'm halfway through implementing this as a pair of float methods. Are there compelling reasons to prefer math module functions over float methods, or vice versa?
Personally, I'm leaning slightly towards float methods: for me, these conversions are important enough to belong in the core language. But I don't have strong feelings either way.
Mark _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Sun, Jun 29, 2008 at 12:46 AM, Raymond Hettinger
Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R?
Sounds good to me. I've attached a Python version of a possible implementation to the issue. See: http://bugs.python.org/file10780/hex_float.py It might be useful for testing. Mark
FWIW, I'm fine with making these methods on float -- a class method
float.fromhex(...) echoes e.g. dict.fromkeys(...) and
datetime.fromordinal(...). The to-hex conversion could be x.hex() --
we don't tend to use ".toxyz()" as a naming convention much in Python.
On Sun, Jun 29, 2008 at 5:26 PM, Mark Dickinson
On Sun, Jun 29, 2008 at 12:46 AM, Raymond Hettinger
wrote: Is everyone agreed on a tohex/fromhex pair using the C99 notation as recommended in 754R?
Sounds good to me.
I've attached a Python version of a possible implementation to the issue. See:
http://bugs.python.org/file10780/hex_float.py
It might be useful for testing.
Mark
-- --Guido van Rossum (home page: http://www.python.org/~guido/)
On Mon, Jun 30, 2008 at 4:53 PM, Guido van Rossum
FWIW, I'm fine with making these methods on float -- a class method float.fromhex(...) echoes e.g. dict.fromkeys(...) and datetime.fromordinal(...). The to-hex conversion could be x.hex() -- we don't tend to use ".toxyz()" as a naming convention much in Python.
Would it be totally outrageous for the float constructor to accept hex strings directly? Mark
On Mon, Jun 30, 2008 at 9:31 AM, Mark Dickinson
On Mon, Jun 30, 2008 at 4:53 PM, Guido van Rossum
wrote: FWIW, I'm fine with making these methods on float -- a class method float.fromhex(...) echoes e.g. dict.fromkeys(...) and datetime.fromordinal(...). The to-hex conversion could be x.hex() -- we don't tend to use ".toxyz()" as a naming convention much in Python.
Would it be totally outrageous for the float constructor to accept hex strings directly?
int('0x10') raises a ValueError as well. You might propose float('0x...p...', 16) but since the format is so specifically different I think that's not completely kosher. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
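The int() parallel Guido draws is easy to verify:

```python
# int() likewise rejects a hex string unless the base is given
# explicitly -- the analogy being made for float() here.
try:
    int('0x10')
    raised = False
except ValueError:
    raised = True
assert raised
assert int('0x10', 16) == 16
```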
On Fri, 27 Jun 2008 07:30:43 am Raymond Hettinger wrote:
The format is already close to the C99 notation but replaces the 'p' with '* 2.0 **' which I find to be both readable and self-explanatory.
Since we're talking about what's "readable and self-explanatory", I find that jarring, unexpected, unintuitive, and mathematically bizarre (even if it is the convention in some areas). It's like writing '123 * A.0 ** -2' for 1.23. And putting spaces around the operators is ugly. I'd like to mention that what bin() et al is actually doing is not so much returning a binary number string but returning a hybrid binary/decimal arithmetic expression. So bin() returns a binary number string for int arguments, and an expression for float arguments: these are conceptually different kinds of things, even if they're both strings. Frankly, I'd be much happier if the API (whatever it is) returned a tuple of (binary string, base int, exponent int), and let users write their own helper function to format it any way they like. Or failing that, the p notation used by Java and C99. (And yes, mixing decimal exponents with binary mantissas upsets me too, but somehow it's less upsetting.) -- Steven
I think the above it still a bit easier to understand than if one has to figure out where the sign/exponent and exponent/fraction bit boundaries are, unbias the exponent, and add the extra hidden '1' bit into the mantissa. That's a lot of mental work.
Sure. However, I'd argue that most people are unable to even remotely guess the number in decimal when presented with the number in hexadecimal - at a minimum, they'll fail to recognize that the exponent is not to the power of 10, or when they realize that, might guess that it is to the power of 16 (I find it fairly confusing, but consequential, that it is to the power of 2, even when the digits are hex - and then, the exponent is decimal :-).

So I'd like to dismiss any objective of "the output must be human-understandable" as unrealistic. That an unambiguous representation is desired, I can understand - but only if there also is a way to enter the same representation elsewhere.

In addition, I fail to see the point in binary representation. For unambiguous representation, it's sufficient to use hex. I can't accept claims that people will be actually able to understand what number is represented when given the bit string. For educational purposes, decoding mantissa and biased exponent directly out of the IEEE representation is better than having binary output builtin.
Also, to follow C's tradition, it would be better if that was *not* integrated into the hex function (or a hex method), but if there was support for %a in string formatting.
I'd be delighted with '%a' support.
I personally find that much less problematic than extending the hex, and wouldn't object to a patch providing such formatting (assuming it does the same thing that C99 printf does). Regards, Martin
[Mark Dickinson]
I have to admit that I can't see much use for octal floats.
Neither do I. They look weird to me. Raymond
participants (14)
- "Martin v. Löwis"
- Alex Martelli
- Antoine Pitrou
- Barry Warsaw
- Eric Smith
- Georg Brandl
- Greg Ewing
- Guido van Rossum
- Mark Dickinson
- Nick Coghlan
- Paul Moore
- Raymond Hettinger
- Steven D'Aprano
- Terry Reedy