Hi All,
Recently some discussion began in the issue 3132 thread (http://bugs.python.org/issue3132) regarding implementation of the new struct string syntax for PEP 3118. Mark Dickinson suggested that I bring the discussion over to Python Dev. Below is a summary of the questions/comments from the thread.
Unpacking a long-double
=======================
1. Should this return a Decimal object or a ctypes 'long double'?
2. Using ctypes 'long double' is easier to implement, but precision is lost when needing to do arithmetic, since the value for ctypes 'long double' is converted to a Python float.
3. Using Decimal keeps the desired precision, but the implementation would be non-trivial and architecture specific (unless we just picked a fixed number of bytes regardless of the architecture).
4. What representation should be used for standard size and alignment? IEEE 754 extended double precision?
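To make point 2 concrete, here is a minimal sketch using the existing ctypes.c_longdouble type. It only shows that reading .value hands back a Python float; how much precision that drops depends on the platform's long double format, which the sketch does not try to detect.

```python
import ctypes

x = ctypes.c_longdouble(1.5)

# Reading .value converts the C long double to a Python float (a C double),
# so any extra precision the platform's long double carries is dropped here.
value = x.value
print(type(value).__name__)   # float

# The storage size is platform-specific, so no fixed number is assumed here.
print(ctypes.sizeof(ctypes.c_longdouble))
```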
Pointers
========
1. What is a specific pointer? For example, is '&d' a pointer to a double?
2. How would unpacking a pointer to a Python Object work out? Given an address how would the appropriate object to be unpacked be determined?
3. Can pointers be nested, e.g. '&&d'?
4. For the 'X{}' format (pointer to a function), is this supposed to mean a Python function or a C function?
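Question 2 can be illustrated with what exists today. The sketch below uses the current 'P' (void *) format code and relies on the CPython implementation detail that id() returns the object's address; it is an illustration of the problem, not a proposal for how the new syntax should behave.

```python
import ctypes
import struct

obj = ["some", "list"]

# In CPython, id() happens to be the object's memory address. This is an
# implementation detail, not a language guarantee.
packed = struct.pack("P", id(obj))      # 'P' is the existing void* code
(address,) = struct.unpack("P", packed)

# The unpacked value is a plain integer; nothing in it says what it points to.
print(type(address).__name__)

# Turning it back into an object means trusting that the address still refers
# to a live PyObject. A stale or foreign address here can crash the process.
recovered = ctypes.cast(address, ctypes.py_object).value
print(recovered is obj)
```

The unpacking side has no way to validate the address or to learn the pointee's type from the bytes alone, which is the gap the question points at.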
String Syntax
=============
The syntax seems to have transcended verbal description. I think we need to put forth a grammar. There are also some questions regarding nesting levels and mixing specifiers that could perhaps be answered more clearly by having a grammar:
1. What nesting level can structures have? Arbitrary?
2. The new array syntax claims "multi-dimensional array of whatever follows". Truly whatever? Arrays of structures? Arrays of pointers?
3. How do array specifiers and pointer specifiers mix? For example, would '(2, 2)&d' be a two-by-two array of pointers to doubles? What about '&(2, 2)d'? Is this a pointer to a two-by-two array of doubles?
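As a rough illustration of why question 3 needs a grammar, here is a toy parser for one possible reading: an optional shape, then zero or more '&', then a single type code. The function name parse_spec and the rule itself are assumptions made for this sketch; this is neither the PEP's definition nor the grammar in the attached diff.

```python
import re

# Assumed rule (hypothetical): [ "(" dims ")" ] "&"* type-code
_SPEC = re.compile(r"\s*(?:\(([^)]*)\))?\s*(&*)([A-Za-z?])\s*$")


def parse_spec(spec):
    """Return (shape, pointer_depth, type_code) under the assumed rule."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError(f"unsupported specifier: {spec!r}")
    shape_text, pointers, code = m.groups()
    # An absent or empty shape means a scalar.
    shape = tuple(int(d) for d in shape_text.split(",")) if shape_text else ()
    return shape, len(pointers), code


print(parse_spec("(2, 2)&d"))   # ((2, 2), 1, 'd')
print(parse_spec("&&d"))        # ((), 2, 'd')
```

Under this particular rule '&(2, 2)d' is rejected with a ValueError, because the shape must come first. A different rule could accept it as a pointer to an array, and prose alone does not settle which reading is intended.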
An example grammar is contained in a diff against the PEP attached to this mail. NOTE: I am *not* actually submitting a patch against the PEP. This was just the clearest way to present the example grammar.
Use Cases
=========
1. What are the real world use cases for these struct string extensions? These should be fleshed out and documented.
-- Meador
On approximately 2/25/2010 8:51 PM, came the following characters from the keyboard of Meador Inge:
Hi All,
Recently some discussion began in the issue 3132 thread (http://bugs.python.org/issue3132) regarding implementation of the new struct string syntax for PEP 3118. Mark Dickinson suggested that I bring the discussion on over to Python Dev. Below is a summary of the questions\comments from the thread.
Unpacking a long-double
1. Should this return a Decimal object or a ctypes 'long double'?
2. Using ctypes 'long double' is easier to implement, but precision is lost when needing to do arithmetic, since the value for ctypes 'long double' is converted to a Python float.
3. Using Decimal keeps the desired precision, but the implementation would be non-trivial and architecture specific (unless we just picked a fixed number of bytes regardless of the architecture).
4. What representation should be used for standard size and alignment? IEEE 754 extended double precision?
Because of 2 (lossy, dependency), and 3 (non-trivial, architecture specific), neither choice in 1 seems appropriate.
Because of the nature of floats, because the need for manipulation may vary between applications, and because the required precision may vary between applications, I would recommend adding a "CLongDoubleStructWrapper" class (a better name would be welcome), which would copy the architecture-specific byte-stream and preserve it. If converted back to a struct, it would be lossless. If manipulation is required, the class could have converters to Python float (lossy), and Decimal of user-specifiable precision (punt the precision question to the application developer, who should know the needs of the application, and the expected platforms).
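A minimal sketch of what such a wrapper might look like, assuming it sits on top of the existing ctypes.c_longdouble. The class and method names are hypothetical, and the Decimal converter is left out because it would need to know the platform's long double format.

```python
import ctypes


class CLongDoubleStructWrapper:
    """Hypothetical wrapper that keeps the platform's raw long double bytes."""

    SIZE = ctypes.sizeof(ctypes.c_longdouble)

    def __init__(self, raw):
        raw = bytes(raw)
        if len(raw) != self.SIZE:
            raise ValueError(f"expected {self.SIZE} bytes, got {len(raw)}")
        self._raw = raw

    def to_bytes(self):
        # Lossless: returns exactly the bytes that were handed in.
        return self._raw

    def to_float(self):
        # Lossy: ctypes converts the long double to a Python float.
        return ctypes.c_longdouble.from_buffer_copy(self._raw).value


raw = bytes(ctypes.c_longdouble(1.5))   # stand-in for bytes read from a buffer
w = CLongDoubleStructWrapper(raw)
print(w.to_bytes() == raw, w.to_float())
```

Values that are never converted cost only a byte copy, which fits the concern about wasteful conversions.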
It might be reasonable to handle double and float similarly, at least as an option. On the other hand, if there can be options, perhaps they could be given when supplying the struct string syntax.... except the application may only wish to manipulate a few of the long double values, and converting the others would be wasteful.
Meador Inge wrote:
Hi All,
Recently some discussion began in the issue 3132 thread ( http://bugs.python.org/issue3132) regarding implementation of the new struct string syntax for PEP 3118. Mark Dickinson suggested that I bring the discussion on over to Python Dev. Below is a summary of the questions\comments from the thread.
Unpacking a long-double
1. Should this return a Decimal object or a ctypes 'long double'?
2. Using ctypes 'long double' is easier to implement, but precision is lost when needing to do arithmetic, since the value for ctypes 'long double' is converted to a Python float.
3. Using Decimal keeps the desired precision, but the implementation would be non-trivial and architecture specific (unless we just picked a fixed number of bytes regardless of the architecture).
4. What representation should be used for standard size and alignment? IEEE 754 extended double precision?
A variant of 2. would be to unpack into a ctypes 'long double', and extend the ctypes 'long double' type to retrieve the value as a Decimal instance, in addition to the default conversion into a Python float.
On Fri, Feb 26, 2010 at 4:08 PM, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Meador Inge wrote:
- Using Decimal keeps the desired precision,
Well, sort of, but then you end up doing arithmetic in decimal instead of binary, which could give different results.
Even with the user-defined precision capabilities of the 'Decimal' class? In other words, can I create an instance of a 'Decimal' that behaves (in all operations: arithmetic, comparison, etc...) exactly as the extended double precision type offered by a given machine?
Maybe the solution is to give ctypes long double objects the ability to do arithmetic?
Maybe, but then we would have to give all numeric 'ctypes' the ability to do arithmetic -- which may be more than we want.
-- Meador
Meador Inge wrote:
On Fri, Feb 26, 2010 at 4:08 PM, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Meador Inge wrote:
- Using Decimal keeps the desired precision,
Well, sort of, but then you end up doing arithmetic in decimal instead of binary, which could give different results.
Even with the user-defined precision capabilities of the 'Decimal' class? In other words, can I create an instance of a 'Decimal' that behaves (in all operations: arithmetic, comparison, etc...) exactly as the extended double precision type offered by a given machine?
Maybe the solution is to give ctypes long double objects the ability to do arithmetic?
Maybe, but then we would have to give all numeric 'ctypes' the ability to do arithmetic -- which may be more than we want.
See issue 887237:
http://bugs.python.org/issue887237
On Sat, Feb 27, 2010 at 11:20 AM, Thomas Heller theller@ctypes.org wrote:
See issue 887237:
Thanks for the link, Thomas. Since there is already interest in adding arithmetic to ctypes, perhaps that is an option. One question this raises in my mind, though: should only 'long double' unpack to a ctypes type in that case, or should all items unpack to ctypes types now? It seems to me that you would want everything to unpack to types from the same family (e.g. Python builtins or ctypes). This seems conceptually cleaner, and the interoperability between types in the same "family" is (or, in the case of modifying ctypes, can be) more clearly defined.
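For reference, this is the "family" that struct.unpack returns today: every item comes back as a Python builtin. The format string below is just an arbitrary example.

```python
import struct

# Pack a little-endian int and double, then unpack them again.
packed = struct.pack("<id", 7, 2.5)
values = struct.unpack("<id", packed)

# Both items are Python builtins, not ctypes instances.
print([type(v).__name__ for v in values])   # ['int', 'float']
```

A long double that unpacked to a ctypes instance would be the one item in such a tuple that is not a builtin.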
Thanks,
-- Meador
Meador Inge wrote:
Even with the user-defined precision capabilities of the 'Decimal' class? In other words, can I create an instance of a 'Decimal' that behaves (in all operations: arithmetic, comparison, etc...) exactly as the extended double precision type offered by a given machine?
It's not precision that's the issue, it's that the number base is different. That affects which numbers can be represented exactly, and how results that can't be represented exactly are rounded.
I would be very surprised if there is a way of configuring the Decimal type so that it gives identical results to that of any IEEE binary floating point type, including rounding behaviour, denormalisation, etc.
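A short illustration of the base difference, using only the standard decimal module and ordinary Python floats (doubles, not long doubles):

```python
from decimal import Decimal

# 0.1 has no exact binary representation, so the float holds a nearby value.
binary_tenth = Decimal(0.1)       # exact value of the binary float
decimal_tenth = Decimal("0.1")    # exactly one tenth
print(binary_tenth == decimal_tenth)                        # False

# The same sum is rounded in binary but is exact in decimal.
print(0.1 + 0.2 == 0.3)                                     # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))    # True
```

Raising the Decimal precision changes how many decimal digits are kept, not which values are exactly representable.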
On Sun, Feb 28, 2010 at 1:39 AM, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
Meador Inge wrote:
Even with the user-defined precision capabilities of the 'Decimal' class? In other words, can I create an instance of a 'Decimal' that behaves (in all operations: arithmetic, comparison, etc...) exactly as the extended double precision type offered by a given machine?
It's not precision that's the issue, it's that the number base is different. That affects which numbers can be represented exactly, and how results that can't be represented exactly are rounded.
I would be very surprised if there is a way of configuring the Decimal type so that it gives identical results to that of any IEEE binary floating point type, including rounding behaviour, denormalisation, etc.
I'd be astonished. :)
Mark
On Fri, Feb 26, 2010 at 4:51 AM, Meador Inge meadori@gmail.com wrote:
Recently some discussion began in the issue 3132 thread (http://bugs.python.org/issue3132) regarding implementation of the new struct string syntax for PEP 3118. Mark Dickinson suggested that I bring the discussion on over to Python Dev. Below is a summary of the questions\comments from the thread.
Thanks for bringing this up here!
Unpacking a long-double
...
I don't want to dwell on this too much, since this is just one small part of the proposed struct module additions, and I'd like to get answers to some of the other questions you raised. If interested parties (Carl?, Travis?) have time to comment on the other questions Meador raised, I'd *really* appreciate it!
For long-double: I'm essentially -1 on any Decimal involvement here, on the basis that the semantics are messy and I doubt that the resulting behaviour would match users' needs or expectations. I'm also not too keen on introducing a long-double-with-arithmetic type (wherever it ends up living); I think having a single (binary) floating-point type has served Python very well, and adding other binary float precisions would risk either adding significant complexity, or leaving the extended precision type somewhat crippled. If there were a long double type that supported arithmetic, how should mixed-mode double + long double operations behave? What about conversions of long doubles to and from decimal strings? Should math module and cmath module functions accept long double arguments, and if so, what type result should they produce? The fact that there are at least 3 different common formats, before considering padding and byte order, for long double (IEEE 80-bit extended used on x86, IEEE binary128, and the so-called double-double format) isn't going to make things any easier here.
I guess, in my currently not-very-informed state, I'd vote for:
- packing and unpacking with the long double format expects and produces a ctypes long double (ctypes.c_longdouble) in native mode; the long-double format would match that of the platform.
- non-native mode packing and unpacking aren't permitted, and raise an exception.
- to do arithmetic with long doubles, users would simply have to convert to and from Python floats, accepting the loss of precision that such conversion entails.
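The three points above can be sketched as follows. The helper names pack_long_double and unpack_long_double are hypothetical stand-ins for whatever the struct module would do internally; only ctypes.c_longdouble and its methods exist today.

```python
import ctypes


def pack_long_double(value):
    """Hypothetical: return the platform's raw bytes for a c_longdouble."""
    if not isinstance(value, ctypes.c_longdouble):
        raise TypeError("expected a ctypes.c_longdouble")
    return bytes(value)


def unpack_long_double(buffer, native=True):
    """Hypothetical: produce a c_longdouble, in native mode only."""
    if not native:
        raise ValueError("long double is only supported in native mode")
    return ctypes.c_longdouble.from_buffer_copy(buffer)


ld = unpack_long_double(pack_long_double(ctypes.c_longdouble(2.5)))

# Arithmetic goes through a Python float, accepting the loss of precision.
total = ld.value + 1.0
print(total)   # 3.5
```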
But I think I'm failing to understand the intended use-cases for this (and other) additions. Would it be possible for Carl or Travis, or anyone else, to give some examples of situations where this would be useful? I don't doubt that such situations exist; I'm just not sure what they are.
Pointers
[...] 2. How would unpacking a pointer to a Python Object work out? Given an address how would the appropriate object to be unpacked be determined?
Again, I'd really like to see some examples of how/when packing and unpacking pointers to Python objects would be used; I'm currently feeling too stupid to understand how this might work in practice.
4. For the 'X{}' format (pointer to a function), is this supposed to mean a Python function or a C function?
Ditto for this; assuming that we're talking about a Python function here.
String Syntax
The syntax seems to have transcended verbal description. I think we need to put forth a grammar.
Agreed: a clear specification is needed.
Use Cases
1. What are the real world use cases for these struct string extensions? These should be fleshed out and documented.
+many.
Mark
On Fri, Feb 26, 2010 at 1:51 PM, Meador Inge meadori@gmail.com wrote:
Hi All,
Recently some discussion began in the issue 3132 thread (http://bugs.python.org/issue3132) regarding implementation of the new struct string syntax for PEP 3118. Mark Dickinson suggested that I bring the discussion on over to Python Dev. Below is a summary of the questions\comments from the thread.
Unpacking a long-double
1. Should this return a Decimal object or a ctypes 'long double'?
2. Using ctypes 'long double' is easier to implement, but precision is lost when needing to do arithmetic, since the value for ctypes 'long double' is converted to a Python float.
3. Using Decimal keeps the desired precision, but the implementation would be non-trivial and architecture specific (unless we just picked a fixed number of bytes regardless of the architecture).
4. What representation should be used for standard size and alignment? IEEE 754 extended double precision?
I think supporting even basic arithmetic correctly for long double would be a tremendous amount of work in Python. First, as you know, there are many different formats, which depend not only on the CPU but also on the OS and the compiler. Second, there are quite a few issues that are specific to long double (like converting to an integer that cannot fit in any C integer type on most implementations).
Also, IEEE 754 does not define any alignment as far as I know; that's up to the CPU implementer, I believe. In NumPy, long double usually maps to either 12 bytes (np.float96) or 16 bytes (np.float128).
I would expect the long double to be mostly useful for data exchange. If you want to do arithmetic on long double, then the user of the buffer protocol would have to implement it themselves (as NumPy does at the moment). So the important thing is to have enough information to use the long double: alignment and size are not enough.
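As a small illustration of that last point, size and alignment are easy to query through ctypes, but they do not identify the format. The values printed depend on the platform and compiler, so none are assumed here.

```python
import ctypes

size = ctypes.sizeof(ctypes.c_longdouble)
align = ctypes.alignment(ctypes.c_longdouble)
print(size, align)

# A 16-byte long double can be an 80-bit x87 extended value plus padding
# (typical on x86-64 Linux) or an IEEE binary128 value (typical on AArch64
# Linux). Size and alignment alone cannot tell a consumer which one it got.
```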
cheers,
David