On Monday, March 10, 2014 at 14:53:42 UTC+1, Stefan Krah wrote:
[My apologies for being terse, I don't have much time to follow this discussion right now.]
Nick Coghlan ncoghlan@gmail.com wrote:
I think users of decimal literals will just need to deal with the risk of unexpected rounding, as the alternatives are even more problematic.
That is why I think we should seriously consider moving to IEEE semantics for a decimal literal. Among other things:
While I find this discussion about decimal literals extremely interesting, in my opinion, such a literal should have an underlying completely new numerical type, if it is really supposed to be for inexperienced users.
Most of the discussions right now concern rounding issues that occur after the decimal point, but I think an at least equally big problem is rounding to the left of it as in (using current float):
>>> 1e50 + 1000
1e+50
Importantly, Decimal is no cure here:
>>> Decimal(10**50) + Decimal(100)
Decimal('1.000000000000000000000000000E+50')
(of course, you can debate context settings to make this particular example work, but, in general, it happens with big enough numbers.)
The solution for this example is using ints of course:
>>> 10**50 + 100
100000000000000000000000000000000000000000000000100
, but obviously this works only for whole numbers, so there currently is no built-in way to make this example work correctly:
>>> 10**50 - 9999999999999999.5
1e+50
(or with Decimal:
>>> Decimal(10**50) - Decimal('9999999999999999.5')
Decimal('1.000000000000000000000000000E+50')
).
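As an aside, the "context settings" escape hatch mentioned above can be sketched concretely (this example is mine, not from the original mail; the prec value of 60 is an arbitrary choice, just large enough to make the subtraction exact):

```python
# Raising the Decimal context precision enough to cover all integral
# digits plus the fraction makes the problematic example exact:
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 60  # 10**50 has 51 digits; the default 28 is far too small
    result = Decimal(10**50) - Decimal('9999999999999999.5')

print(result)  # 99999999999999999999999999999999990000000000000000.5
```

The catch, as noted, is that no fixed precision setting works "in general" -- big enough numbers always overrun it.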
If we are discussing a new number literal, I would like to have it cope with this, and my suggestion is a new numeric type that's sort of a hybrid between int and either Decimal or float, i.e., a type that behaves like int for digits left of the decimal point, but may truncate to the right. In pure Python, something similar (but not a literal form, of course) could be implemented as a class that stores the left digits as an int and the right digits as a float/Decimal internally. Calculations involving this class would be slow due to the constant need to shift digits between the integer and the float part, and might be too slow for many users even when written in C, but its behavior would meet the expectations of inexperienced people better than the existing types do. Going back to Mark Harris' initial proposal of unifying numeric types (which I am not trying to support in any way here), such a type would even allow unifying int and float, since ints could be considered a subset of the new type with a fractional part of zero.
Cheers, Wolfgang
On Mon, Mar 10, 2014 at 10:48 AM, Wolfgang Maier wolfgang.maier@biologie.uni-freiburg.de wrote:
[...]
If we are discussing a new number literal, I would like to have it cope with this, and my suggestion is a new numeric type that's sort of a hybrid between int and either Decimal or float, i.e., a type that behaves like int for digits left of the decimal point, but may truncate to the right. In pure Python, something similar (but not a literal form, of course) could be implemented as a class that stores the left digits as an int and the right digits as a float/Decimal internally. Calculations involving this class would be slow due to the constant need to shift digits between the integer and the float part, and might be too slow for many users even when written in C, but its behavior would meet the expectations of inexperienced people better than the existing types do. Going back to Mark Harris' initial proposal of unifying numeric types (which I am not trying to support in any way here), such a type would even allow unifying int and float, since ints could be considered a subset of the new type with a fractional part of zero.
I think what you're proposing here is a variant of fixed-point numbers. The representation you seem to be looking for is an arbitrary-precision integer plus an exponent. A question for you: how would you treat results like 1/3?
-- --Guido van Rossum (python.org/~guido)
Guido van Rossum guido@... writes:
I think what you're proposing here is a variant of fixed-point numbers. The representation you seem to be looking for is an arbitrary-precision integer plus an exponent. A question for you: how would you treat results like 1/3?
No, I don't think this is what I'm proposing. I said digits left of the decimal point should behave like a Python integer, i.e., have arbitrary precision, but to the right of it things should look like a float or Decimal with fixed precision. So 1/3 would just look like float(1/3) for example.
Best, Wolfgang
On Mon, Mar 10, 2014 at 11:03 AM, Wolfgang Maier wolfgang.maier@biologie.uni-freiburg.de wrote:
Guido van Rossum guido@... writes:
I think what you're proposing here is a variant of fixed-point numbers. The representation you seem to be looking for is an arbitrary-precision integer plus an exponent. A question for you: how would you treat results like 1/3?
No, I don't think this is what I'm proposing. I said digits left of the decimal point should behave like a Python integer, i.e., have arbitrary precision, but to the right of it things should look like a float or Decimal with fixed precision. So 1/3 would just look like float(1/3) for example.
I'm sorry, I still don't understand your proposal. Suppose I have two numbers following that description and I divide them, and suppose the result cannot be represented as a decimal or binary fraction. What is your algorithm for deciding how many digits after the decimal point to keep? (Or how many bits after the binary point -- are you proposing a binary or a decimal representation?) Is the number of bits/digits to keep a constant, or does it vary by how many digits/bits are to the left of the point? And what about multiplications? Should they always produce an exact result or should they truncate if the result requires more digits/bits behind the point than a certain number? (If the latter, how is that limit determined?)
-- --Guido van Rossum (python.org/~guido)
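For comparison (my illustration, not part of the thread): decimal.Decimal answers the questions Guido raises above by keeping a fixed number of *significant* digits -- the context precision -- for multiplication and division alike:

```python
# Decimal's answer to "how many digits do you keep?": the context's
# precision, counted in significant digits, for every operation.
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 6  # keep six significant digits
    quotient = Decimal(1) / Decimal(3)                 # division rounds to prec
    product = Decimal('1234.56') * Decimal('1000.01')  # so does multiplication

print(quotient)  # 0.333333
print(product)   # 1.23457E+6  (exact result 1234572.3456 rounded to 6 digits)
```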
On 03/10/2014 11:09 AM, Guido van Rossum wrote:
I'm sorry, I still don't understand your proposal.
I think he's saying make a new type similarly to complex, only instead of two floats to make a complex number, have a (long) int and a decimal float to make this new type. The long int portion would have infinite precision, the float portion would have, say, 16 digits (or whatever).
-- ~Ethan~
On Tue, Mar 11, 2014 at 7:57 AM, Ethan Furman ethan@stoneleaf.us wrote:
I think he's saying make a new type similarly to complex, only instead of two floats to make a complex number, have a (long) int and a decimal float to make this new type. The long int portion would have infinite precision, the float portion would have, say, 16 digits (or whatever).
That's plausible as a representation, and looks tempting, but basic arithmetic operations become more complicated. Addition and subtraction just need to worry about carries, but multiplication forks out into four multiplications (int*int, int*frac, frac*int, frac*frac), and division becomes similarly complicated. Would it really be beneficial?
ChrisA
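The four partial products Chris describes fall out of ordinary distribution: (i1 + f1)(i2 + f2) = i1*i2 + i1*f2 + i2*f1 + f1*f2. A small sketch (mine; `mul_parts` is a hypothetical helper) using Fraction for the fractional halves so the check is exact:

```python
# Multiply two (integer part, fractional part) pairs via the four
# cross products, and verify against direct multiplication.
from fractions import Fraction

def mul_parts(i1, f1, i2, f2):
    """Compute (i1 + f1) * (i2 + f2) as four partial products."""
    return i1 * i2 + i1 * f2 + i2 * f1 + f1 * f2

a_int, a_frac = 3, Fraction(1, 4)  # 3.25
b_int, b_frac = 2, Fraction(1, 2)  # 2.5

product = mul_parts(a_int, a_frac, b_int, b_frac)
assert product == (a_int + a_frac) * (b_int + b_frac)  # 65/8 == 8.125
```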
On Mon, Mar 10, 2014 at 2:51 PM, Chris Angelico rosuav@gmail.com wrote:
On Tue, Mar 11, 2014 at 7:57 AM, Ethan Furman ethan@stoneleaf.us wrote:
I think he's saying make a new type similarly to complex, only instead of two floats to make a complex number, have a (long) int and a decimal float to make this new type. The long int portion would have infinite precision, the float portion would have, say, 16 digits (or whatever).
That's plausible as a representation, and looks tempting, but basic arithmetic operations become more complicated. Addition and subtraction just need to worry about carries, but multiplication forks out into four multiplications (int*int, int*frac, frac*int, frac*frac), and division becomes similarly complicated. Would it really be beneficial?
It looks neither plausible nor tempting to me at all, and I hope that's not what he meant. It can represent numbers of any magnitude that have lots of zeros following the decimal point followed by up to 16 digits of precision, but not numbers that have e.g. lots of ones instead of those zeros -- the float portion would be used up for the first 16 ones. E.g.
111111111111111111111111111111.000000000000000000000000000000123456789
would be representable exactly but not
111111111111111111111111111111.111111111111111111111111111111123456789
What makes numbers in the vicinity of integers special?
-- --Guido van Rossum (python.org/~guido)
On Tue, Mar 11, 2014 at 8:57 AM, Guido van Rossum guido@python.org wrote:
It looks neither plausible nor tempting to me at all, and I hope that's not what he meant. It can represent numbers of any magnitude that have lots of zeros following the decimal point followed by up to 16 digits of precision, but not numbers that have e.g. lots of ones instead of those zeros -- the float portion would be used up for the first 16 ones. E.g.
111111111111111111111111111111.000000000000000000000000000000123456789
would be representable exactly but not
111111111111111111111111111111.111111111111111111111111111111123456789
What makes numbers in the vicinity of integers special?
Hmm, good point. I was thinking this would give a predictable 16 digits of precision after the decimal, but the leading zeroes are somewhat special. But when I said "tempting" I meant that it looks initially nice, and then went on to show that it's not so nice on analysis - which latter part you're also demonstrating.
ChrisA
Guido van Rossum guido@... writes:
On Mon, Mar 10, 2014 at 2:51 PM, Chris Angelico rosuav@gmail.com wrote:
On Tue, Mar 11, 2014 at 7:57 AM, Ethan Furman ethan@stoneleaf.us wrote:
I think he's saying make a new type similarly to complex, only instead of two floats to make a complex number, have a (long) int and a decimal float to make this new type. The long int portion would have infinite precision, the float portion would have, say, 16 digits (or whatever).
That's plausible as a representation, and looks tempting, but basic arithmetic operations become more complicated. Addition and subtraction just need to worry about carries, but multiplication forks out into four multiplications (int*int, int*frac, frac*int, frac*frac), and division becomes similarly complicated. Would it really be beneficial?
It looks neither plausible nor tempting to me at all, and I hope that's not what he meant. It can represent numbers of any magnitude that have lots of zeros following the decimal point followed by up to 16 digits of precision, but not numbers that have e.g. lots of ones instead of those zeros -- the float portion would be used up for the first 16 ones. E.g.
111111111111111111111111111111.000000000000000000000000000000123456789
would be representable exactly but not
111111111111111111111111111111.111111111111111111111111111111123456789
What makes numbers in the vicinity of integers special?
I'm afraid it is exactly what I'm proposing. I don't see, though, how this is different from the current behavior of, let's say, Decimal. Assuming the default context with prec=28 you currently get:
>>> +Decimal('0.000000000000000000000000000000123456789')
Decimal('1.23456789E-31')
, but with a leading 1 that consumes precision (a single one is enough, actually):
>>> +Decimal('0.100000000000000000000000000000123456789')
Decimal('0.1000000000000000000000000000')
Cheers, Wolfgang
On Mon, Mar 10, 2014 at 3:36 PM, Wolfgang Maier wolfgang.maier@biologie.uni-freiburg.de wrote:
Guido van Rossum guido@... writes:
[...]
I'm afraid it is exactly what I'm proposing. I don't see, though, how this is different from the current behavior of, let's say, Decimal. Assuming the default context with prec=28 you currently get:
>>> +Decimal('0.000000000000000000000000000000123456789')
Decimal('1.23456789E-31')
, but with a leading 1 that consumes precision (a single one is enough, actually):
>>> +Decimal('0.100000000000000000000000000000123456789')
Decimal('0.1000000000000000000000000000')
It is very different. Decimal with prec=28 counts the number of digits from the first non-zero digit, and the number of digits it gives you after the decimal point depends on how many digits there are before it. That is a sane perspective on precision (the total number of significant digits).
But in your proposal the number of digits you get after the point depends on how close the value is to the nearest integer (in the direction of zero), not how many significant digits you have in total. That's why my examples started with lots of ones, not zeros.
-- --Guido van Rossum (python.org/~guido)
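Guido's distinction can be made concrete (my example, not from the mails): with prec=5, Decimal always keeps five *significant* digits, so how many digits survive after the point depends on where the first nonzero digit sits, not on the integer part:

```python
# Decimal counts precision from the first nonzero digit; unary +
# applies the current context's rounding to a value.
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 5
    big = +Decimal('111111.111')      # rounded to 5 significant digits
    small = +Decimal('0.000111111111')

print(big)    # 1.1111E+5 -- no fractional digits survive
print(small)  # 0.00011111 -- five digits, all after the point
```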
Chris Angelico rosuav@... writes:
On Tue, Mar 11, 2014 at 7:57 AM, Ethan Furman ethan@... wrote:
I think he's saying make a new type similarly to complex, only instead of two floats to make a complex number, have a (long) int and a decimal float to make this new type. The long int portion would have infinite precision, the float portion would have, say, 16 digits (or whatever).
That's plausible as a representation, and looks tempting, but basic arithmetic operations become more complicated. Addition and subtraction just need to worry about carries, but multiplication forks out into four multiplications (int*int, int*frac, frac*int, frac*frac), and division becomes similarly complicated. Would it really be beneficial?
That's right, it would complicate calculations exactly as you are pointing out. That's why I said it might be too slow for most users even when implemented optimally, but then I'm not sure whether performance should be the very first thing to discuss.
Best, Wolfgang
On 03/10/2014 12:48 PM, Wolfgang Maier wrote:
Going back to Mark Harris' initial proposal of unifying numeric types (which I am not trying to support in any way here), such a type would even allow unifying int and float, since ints could be considered a subset of the new type with a fractional part of zero.
This turned out a bit longer than I intended. I think it is the correct approach to what I think is the intent of the number unifications suggestions. Trim it down a bit, and add some examples and it's a good start to a PEP. ;-)
I think adding a new number type isn't going to help that much, although it won't hurt either.
(Hoping this is a more practical direction for these discussions.)
Add a BaseValue class for creating context dependent values like a CurrencyValue class, or StatisticValue class.
The Value class should not specify any particular number type. That is an implementation detail of a subclass.
The Value class will define how to round, compare, and track error ranges when used in calculations. And will use ValueContext objects to adjust that as needed.
Add a ValueContext class to create context parameter objects. These objects will be used to set a Value class's context for rounding, number of significant digits, and tracking error ranges.
The ValueContext objects should be very simple, and loosely defined in the same way slice objects are. They are only used for passing the context parameters. (Like slice objects pass indices, but probably via called methods rather than by syntax.)
There have been a number of suggestions to unify numbers in order to make it easier for less mathematically inclined users to do more complex calculations without having to know quite as much about the underlying type of number. There is no magic wand that will make it completely painless, but we can make things easier to get right for an average programmer with average to good math skills. (Probably the normal group for most beginning programming students.)
The basic issue I see isn't with the number types, but with how we use them (and how we can use them better). There are a lot of parts that need to work both independently and together at the same time.
Currently it's up to the programmer to keep track of significant digits (and significant error), and of any rounding that may need to be done, and to do it all correctly. There are many places where it can go wrong, and getting it right requires a good understanding of these concepts within the specialty they are being used in.
The Decimal module tries to reduce that work, and also reduce errors, and increase overall accuracy, all at the same time. But that doesn't help the existing number types. Also while just using the decimal type isn't that difficult, using the other features it has correctly isn't as easy.
It's not easy to mix and match these ideas and come up with a usable, and reusable, interface for doing many similar types of calculations correctly as each value in a calculation (or function call), may not need the same exact requirements.
Possibly the broadest, but still practical, viewpoint gives three kinds of numbers. A number kind, as described here, refers to how numbers are used, and is independent of the way they are stored in the computer. The number kind is what determines some of the requirements of a calculation or comparison.
If we consider that how a number is used as independent (as much as is reasonable) from how the number is stored in memory, then maybe the type is less important than the context.
Measurements: Values which have limited accuracy and need both proper rounding and tracking of error ranges.
Exact numbers: Values that really do represent exactly what they are. Usually smaller sized numbers, and usually not fractional amounts. (Includes counters and indexes, for example.)
Ideal numbers: Values that are not measurements but can be calculated as accurately as needed and generally don't require keeping track of error because the significant digits can far exceed any measurements.
Ints and floats actually do a very nice job of representing kinds 2 and 3 in most cases. They generally don't need a context associated with them, and usually don't need rounding. For example, we don't want to round indexes and counters; they are in the exact group. And we can calculate pi to whatever precision we need, so that any error range in the value is less than significant.
Another issue is equality and how a value's exactness affects it. For example, suppose we have two values each with an error of +- 1. We can say each has a width of 3 (or an inexactness of 3).
So we might have this case where two values overlap when you consider the width of the value (its +- amount):
(23 +- 1) < (24 +- 1) --> False
Of course those would be objects instead of literals. (Whether they are decimal, int, or float isn't important.)
This can be extended to all the other comparison operators. The usual way of dealing with it is to do an error range test, which adds a lot of noise to an otherwise simple operation. Especially if you need to put those in many different places. They aren't quite complex enough to justify functions, but complex enough to be bothersome and can be a common source of errors.
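A minimal sketch (my code; the class and names are hypothetical) of the kind of width-aware comparison described above, where values only order when their error ranges cannot overlap:

```python
# A value with a symmetric +- error; comparisons account for the width.
class Approx:
    def __init__(self, value, err):
        self.value, self.err = value, err

    def __lt__(self, other):
        # strictly less only if the two ranges cannot overlap
        return self.value + self.err < other.value - other.err

print(Approx(23, 1) < Approx(24, 1))  # False -- [22, 24] overlaps [23, 25]
print(Approx(20, 1) < Approx(24, 1))  # True  -- [19, 21] lies below [23, 25]
```

Hiding the range test behind the operator removes exactly the noise the paragraph above complains about.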
Exactness, (and rounding, as it's directly related to exactness), are independent concepts from how the number is stored in a computer. (Unless you use the wrong type of course, but then I would consider it a bug.)
One approach is to have a Value class, and a ValueContext object. The ValueContext object may be more like a slice object. It would be up to the Value class to interpret it so that it can know how to do the operator methods correctly... __round__, __add__, __eq__, etc...
A base Value class could be used to define types such as a Measurement class, Or a Currency class, (and others).
The context API needs to be on a separate Value class/type rather than on the number, in my opinion. I think trying to make a number type also be a value is what gets us mixed up. The number is just a scalar for the value. It might also be called a vector, but that is a technical term that can be confused with other concepts.
Doesn't change any existing number type.
Value objects work with any number types.
Value objects can have a repr that is much nicer, and more meaningful than the underlying number type.
Context sub-class's define how they are used (or act) rather than by what kind of computer number they may contain.
Context (slice like objects) that can be shared, and/or passed between Value objects as needed to keep track of rounding, significant digits, and error range.
Can be a library module.
If Value objects are something that can be included in Python, then a simpler decimal64 type without all the context settings could be used, as it is just another computer number type, with a bit more accuracy if you need it.
A Currency Value type could use Decimal, and the type could be swapped out later for decimal64, as that becomes just an implementation detail of the Currency Value class.
Users could do something like that now by defining classes, and probably have done so within their applications. I consider that a supporting argument for this approach, rather than an argument against it. It still takes a fair amount of math knowledge to get it right.
Making context dependent calculations easier to do, and use, would be very nice, and modules using these Value objects could possibly work together better.
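A very rough sketch (mine; every name here is hypothetical) of the Value/ValueContext split described above -- the context object only carries parameters, and the Value subclass decides how to apply them:

```python
# ValueContext is a dumb parameter holder, loosely analogous to a slice
# object; CurrencyValue interprets it when rounding and combining values.
from decimal import Decimal, ROUND_HALF_EVEN

class ValueContext:
    def __init__(self, places, rounding=ROUND_HALF_EVEN):
        self.places = places      # significant places after the point
        self.rounding = rounding  # rounding mode, applied by the Value class

class CurrencyValue:
    """A context-dependent value; rounding lives here, not in the number."""
    def __init__(self, amount, ctx):
        self.ctx = ctx
        quantum = Decimal(1).scaleb(-ctx.places)  # e.g. Decimal('0.01')
        self.amount = Decimal(str(amount)).quantize(quantum,
                                                    rounding=ctx.rounding)

    def __add__(self, other):
        return CurrencyValue(self.amount + other.amount, self.ctx)

    def __repr__(self):
        return 'CurrencyValue(%s)' % self.amount

cents = ValueContext(places=2)
total = CurrencyValue('19.999', cents) + CurrencyValue('0.004', cents)
print(total)  # CurrencyValue(20.00)
```

The underlying number type (Decimal here) stays an implementation detail, which is the point of the proposal.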
A good test might be how "Value" objects could be used in the statistics module. Or maybe Mark Harris could try out this idea in his new module.
I think this is the sort of thing Mark was initially looking for. It's really not that complicated if the order of dependencies is correct. (Probably he's got something close to this in his new module already, just with a different name. Just a guess, but any solution that solves some of this may not be that far off.)
BTW, The new literals would be helpful with this too. +1
Cheers, Ron
Wolfgang Maier wolfgang.maier@... writes:
[...]
If we are discussing a new number literal, I would like to have it cope with this, and my suggestion is a new numeric type that's sort of a hybrid between int and either Decimal or float, i.e., a type that behaves like int for digits left of the decimal point, but may truncate to the right. In pure Python, something similar (but not a literal form, of course) could be implemented as a class that stores the left digits as an int and the right digits as a float/Decimal internally. Calculations involving this class would be slow due to the constant need to shift digits between the integer and the float part, and might be too slow for many users even when written in C, but its behavior would meet the expectations of inexperienced people better than the existing types do. Going back to Mark Harris' initial proposal of unifying numeric types (which I am not trying to support in any way here), such a type would even allow unifying int and float, since ints could be considered a subset of the new type with a fractional part of zero.
So, I've tried my hand at the pure Python implementation of this (code below) to make it easier for people to play around with the idea. I found it easier in the end to represent both the integral and the fractional part of the new type as Python ints (restricting the precision of the fractional component). In the compound object, the integral part has arbitrary precision; the fractional part has a fixed precision of 28 digits by default (and in response to Guido: this is constant even with leading zeros), but this can be changed temporarily through a context manager (a very simple one, though).
This is a very preliminary version that is incomplete still and could certainly be optimized in many places. It's also slow, though maybe not as slow as I thought (it's definitely fun to work with it on simple problems in interactive mode).
Feedback is welcome, Wolfgang
Here's the code:
""" Test of a new numeric class PyNumber.
Usage:
Generate a new PyNumber with a fractional component of 0 (equal to an int):
PyNumber(3) PyNumber('3.0')
Generate a PyNumber with a fractional component Decimal('0.1'):
PyNumber(3, Decimal('0.1')) PyNumber('3.1')
Generate the same PyNumber by specifying fractional as int and exponent:
PyNumber(3, 1, exp = 1) PyNumber('3.1')
Generate the same PyNumber again using a string:
PyNumber('3.1') PyNumber('3.1')
Control decimal precision through context manager (default precision is 28):
with Prec(3): PyNumber('0.333333333333333') PyNumber('0.333')
with Prec(7): PyNumber('1') / PyNumber('3') PyNumber('0.3333333')
Addition, subtraction, multiplication and division work as expected, but currently only between PyNumbers.
"""
from decimal import *
from math import log, floor

max_exp = 28

class Prec (object):
    def __init__ (self, prec):
        self.prec = prec
    def __enter__ (self):
        global max_exp
        self.old_prec = max_exp
        max_exp = self.prec
    def __exit__ (self, *_):
        global max_exp
        max_exp = self.old_prec
class PyNumber (object):
    """Numeric type for intuitive calculations.

    Supports arbitrary precision of the integral component, i.e., left of the
    decimal point, and provides fixed precision (module default: 28 digits)
    for the fractional component, i.e., right of the decimal point.
    Behaves well with calculations involving huge and small numbers, for which
    float and Decimal produce rounding errors.

    Modified sum example from the statistics module:
    >>> sum([1e50, 1.1, -1e50] * 1000)  # floating point calculation gives zero
    0.0
    >>> sum([PyNumber(10**50), PyNumber('1.1'), PyNumber(-10**50)] * 1000,
    ...     PyNumber(0))
    PyNumber('1100.0')
    """
    def __init__ (self, integral, fractional = 0, exp = 0):
        # PyNumber objects are composed of two integers, one unbounded for
        # representing the integral part of a number, the other restricted to
        # max_exp digits for representing the fractional component.
        if isinstance(integral, int):
            if integral >= 0:
                self.integral = integral
                self.sign = 1
            else:
                self.integral = -integral
                self.sign = -1
            if fractional < 0:
                raise TypeError ('negative fraction not allowed')
            if isinstance(fractional, int):
                self.fractional = fractional
                if fractional > 0:
                    if exp == 0:
                        self.exp = floor(log(fractional, 10)+1)
                    else:
                        self.exp = exp
                elif fractional == 0:
                    self.exp = 0
                else:
                    raise TypeError ('Unknown error caused by fractional.')
            elif isinstance(fractional, Decimal):
                if fractional >= 1:
                    raise ValueError ('need a fraction < 1.')
                sign, digits, exp = fractional.as_tuple()
                self.fractional = 0
                for digit in digits:
                    self.fractional = self.fractional*10 + digit
                self.exp = -exp
            else:
                raise TypeError ('Expected int or Decimal for fractional part.')
        elif isinstance(integral, str):
            x = integral.split('.')
            if len(x) > 2:
                raise ValueError('Float format string expected.')
            self.integral = int(x[0])
            if self.integral >= 0:
                self.sign = 1
            else:
                self.integral = -self.integral
                self.sign = -1
            if len(x) == 2:
                self.fractional = int(x[1])
            else:
                self.fractional = 0
            if self.fractional > 0:
                self.exp = len(x[1])
            else:
                self.exp = 0
        else:
            raise TypeError('Expected int or str as first argument.')
        if self.exp > max_exp:
            self.fractional //= 10**(self.exp-max_exp)
            self.exp = max_exp
    def __add__ (self, other):
        if self.sign == -1:
            if other.sign == 1:
                # -1 + 2 = 2 - 1
                return PyNumber(*other._sub_absolute(self))
        if other.sign == -1:
            if self.sign == 1:
                # 1 + (-2) = 1 - 2
                return PyNumber(*self._sub_absolute(other))
        if self.exp >= other.exp:
            integral, fractional = self._add_absolute(other)
        else:
            integral, fractional = other._add_absolute(self)
        # the fractional sum is scaled to the larger of the two exponents
        return PyNumber(self.sign * integral, fractional,
                        max(self.exp, other.exp))

    def _add_absolute(self, other):
        # add integral components (assumes self.exp >= other.exp).
        integral = self.integral + other.integral
        # adjust and add fractional components
        fractional = self.fractional + other.fractional * 10**(self.exp-other.exp)
        # determine overflow of fractional sum into integral part
        overflow = fractional // 10**self.exp
        if overflow:
            integral += overflow
            fractional -= overflow * 10**self.exp
        return integral, fractional
    def __sub__(self, other):
        if self.sign == -1 and other.sign == 1:
            # -1 - 2 = -(1 + 2); call _add_absolute on the larger exponent
            a, b = (self, other) if self.exp >= other.exp else (other, self)
            integral, fractional = a._add_absolute(b)
            return PyNumber(-integral, fractional, a.exp)
        if other.sign == -1 and self.sign == 1:
            # 1 - (-2) = 1 + 2
            a, b = (self, other) if self.exp >= other.exp else (other, self)
            return PyNumber(*a._add_absolute(b), exp=a.exp)
        if other.sign == -1 and self.sign == -1:
            # -1 - (-2) = 2 - 1
            return PyNumber(*other._sub_absolute(self))
        return PyNumber(*self._sub_absolute(other))
    def _sub_absolute(self, other):
        sign = 1
        # merge integral and fractional components into big ints.
        a = self.integral * 10**self.exp + self.fractional
        b = other.integral * 10**other.exp + other.fractional
        # adjust the two representations.
        if self.exp > other.exp:
            b *= 10**(self.exp - other.exp)
        elif other.exp > self.exp:
            a *= 10**(other.exp - self.exp)
        # subtract.
        diff = a - b
        if diff < 0:
            diff = -diff
            sign = -1
        # split the big int back into integral and fractional.
        shift = max(self.exp, other.exp)
        integral, fractional = divmod(diff, 10**shift)
        return sign * integral, fractional, shift
    def __mul__(self, other):
        if self.exp < other.exp:
            return other * self
        # multiply the integral components.
        p1 = self.integral * other.integral
        # adjust the fractional parts.
        f1 = self.fractional
        f2 = other.fractional * 10**(self.exp - other.exp)
        # cross-multiply integral and fractional components.
        p2 = (
            self.integral * f2 * 10**self.exp +
            other.integral * f1 * 10**self.exp +
            f1 * f2)
        # determine overflow of cross-multiplication into integral part.
        shift = 2 * self.exp
        overflow = p2 // 10**shift
        if overflow > 0:
            p1 += overflow
            p2 -= overflow * 10**shift
        # truncate trailing zeros (guard against p2 == 0, which would
        # otherwise loop forever).
        while p2 and not p2 % 10:
            p2 //= 10
            shift -= 1
        sign = self.sign * other.sign
        return PyNumber(sign * p1, p2, shift)
    def __truediv__(self, other):
        # merge integral and fractional components into big ints.
        n = self.integral * 10**self.exp + self.fractional
        d = other.integral * 10**other.exp + other.fractional
        # adjust the two representations.
        if self.exp > other.exp:
            d *= 10**(self.exp - other.exp)
        elif other.exp > self.exp:
            n *= 10**(other.exp - self.exp)
        # divide.
        integral, remainder = divmod(n, d)
        # turn remainder into fractional component.
        remainder *= 10
        shift = 1
        while remainder % d and shift < max_exp:
            remainder *= 10
            shift += 1
        fractional = remainder // d
        sign = self.sign * other.sign
        return PyNumber(sign * integral, fractional, shift)
    def __int__(self):
        return self.sign * self.integral
    def __float__(self):
        return self.sign * (self.integral + self.fractional / 10**self.exp)
    def __repr__(self):
        return "PyNumber('{0}.{1}{2}')".format(
            self.sign * self.integral,
            '0' * (self.exp - len(str(self.fractional))), self.fractional)
This representation makes more sense; it is fixed point. But you can just use a single integer and keep track of where the point should be.

On Mar 12, 2014 8:34 AM, "Wolfgang Maier" wolfgang.maier@biologie.uni-freiburg.de wrote:
Guido van Rossum guido@... writes:
This representation makes more sense, it is fixed point. But you can just use a single integer and keep track of where the point should be.
Right, the main reason I haven't tried that is that I started out with an int for representing the integral part and a Decimal for the rest, so from the beginning I had two separate objects in the code. On the other hand, I assume that a single int might also slow down the trimming of the fractional part, which is necessary to keep its precision fixed?
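As a rough illustration of Guido's single-integer suggestion: store the value as one scaled int, and trimming back to the fixed fractional precision becomes a single floor division rather than shuffling digits between two parts. All names here (ScaledFixed, FRAC_DIGITS) are my own illustrative choices, not code from this thread:

```python
# Hypothetical sketch of the single-integer representation.
FRAC_DIGITS = 28
SCALE = 10 ** FRAC_DIGITS

class ScaledFixed:
    """Fixed-point number stored as value * 10**-FRAC_DIGITS in one int."""
    def __init__(self, scaled):
        self.scaled = scaled                 # already trimmed to FRAC_DIGITS

    @classmethod
    def from_int(cls, n):
        return cls(n * SCALE)

    def __add__(self, other):
        # one integer addition; no carry step between separate parts
        return ScaledFixed(self.scaled + other.scaled)

    def __mul__(self, other):
        # trimming to 28 fractional digits is a single floor division
        return ScaledFixed(self.scaled * other.scaled // SCALE)

    def __repr__(self):
        i, f = divmod(abs(self.scaled), SCALE)
        sign = '-' if self.scaled < 0 else ''
        return "ScaledFixed('%s%d.%028d')" % (sign, i, f)

# huge + tiny stays exact, unlike float or default-context Decimal
a = ScaledFixed.from_int(10**50)
b = ScaledFixed(5 * 10**(FRAC_DIGITS - 1))   # represents 0.5
print(a + b)                                 # integral digits fully preserved
```

Whether this is actually faster than the two-part layout is exactly the question raised here; the sketch only shows that the trimming step is cheap.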
On Wed, Mar 12, 2014 at 8:48 AM, Wolfgang Maier wolfgang.maier@biologie.uni-freiburg.de wrote:
Guido van Rossum guido@... writes:
This representation makes more sense, it is fixed point. But you can just use a single integer and keep track of where the point should be.
Right, the main reason I haven't tried that is that I started out with an int for representing the integral part and a Decimal for the rest, so from the beginning I had two separate objects in the code. On the other hand, I assume that a single int might also slow down the trimming of the fractional part, which is necessary to keep its precision fixed?
You can't be sure without timing it, but my expectation is that unless the number of digits is really large (like in the 1000s or more), the smaller number of operations will always be quicker.
-- --Guido van Rossum (python.org/~guido)
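Guido's expectation can be checked with a quick micro-benchmark. This is a sketch under my own assumptions: the two representations below are simplified stand-ins for the discussion, not the PyNumber class from this thread.

```python
import timeit

SCALE = 10**28  # fixed 28 fractional digits

def add_two_part(a_int, a_frac, b_int, b_frac):
    # two-integer representation: add the parts, then carry the overflow
    frac = a_frac + b_frac
    carry, frac = divmod(frac, SCALE)
    return a_int + b_int + carry, frac

def add_single(a, b):
    # single scaled-integer representation: one addition, no carry step
    return a + b

x = (10**50, 5 * 10**27)        # 1e50 + 0.5 as (integral, fractional)
y = (123456789, 25 * 10**26)    # 123456789.25
xs = x[0] * SCALE + x[1]        # the same values as single scaled ints
ys = y[0] * SCALE + y[1]

t_two = timeit.timeit(lambda: add_two_part(*x, *y), number=100_000)
t_one = timeit.timeit(lambda: add_single(xs, ys), number=100_000)
print("two-part: %.3fs  single-int: %.3fs" % (t_two, t_one))
```

Both variants produce the same value; the timing difference (if any) on a given machine is what the benchmark measures.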
On Wed, Mar 12, 2014, at 11:37, Guido van Rossum wrote:
This representation makes more sense, it is fixed point. But you can just use a single integer and keep track of where the point should be.
If you're keeping track of it other than implicitly, isn't that floating point? Just of a NNNNNNN * 10^-X variety instead of 1.NNNNNN * 2^X.
On Wed, Mar 12, 2014 at 8:49 AM, random832@fastmail.us wrote:
On Wed, Mar 12, 2014, at 11:37, Guido van Rossum wrote:
This representation makes more sense, it is fixed point. But you can just use a single integer and keep track of where the point should be.
If you're keeping track of it other than implicitly, isn't that floating point? Just of a NNNNNNN * 10^-X variety instead of 1.NNNNNN * 2^X.
No, his rules for how many digits to keep (everything to the left, only 28 to the right) make it fixed-point. Keeping track of the number explicitly just makes it simpler to create variants where instead of 28 you use another number of digits.
But the use cases for this kind of representation are pretty limited.
-- --Guido van Rossum (python.org/~guido)
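The distinction Guido draws can be made concrete: under a fixed-point rule the absolute resolution is constant regardless of magnitude, while a binary float's resolution grows with the magnitude of the value. A small illustrative sketch (not code from the thread):

```python
import math

# A float's resolution grows with magnitude: near 1e50 the gap between
# adjacent representable floats dwarfs 1000, so the addition is lost.
print(math.ulp(1.0))      # spacing near 1.0 (tiny)
print(math.ulp(1e50))     # spacing near 1e50 (astronomically larger)
assert 1e50 + 1000 == 1e50

# With the fixed-point rule (all integral digits, 28 fractional ones)
# the quantum is always 10**-28, independent of magnitude:
SCALE = 10**28
huge = 10**50 * SCALE                       # 1e50 as a scaled int
assert huge + 1 != huge                     # one 1e-28 quantum is never lost
assert (huge + 1000 * SCALE) // SCALE == 10**50 + 1000
```

That constant quantum is what makes it fixed point: the position of the decimal point never moves with the size of the number.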
Guido van Rossum guido@... writes:
This representation makes more sense, it is fixed point. But you can just use a single integer and keep track of where the point should be.
One of the reasons I started out with two separately stored parts has to do with my initial motivation for forking the parallel "Python Numbers as Human Concept Decimal System" discussion.
That one is about a new numeric 'd' literal and the discussion now seems to go in the direction of having this generate a new number type closely related to the current decimal.Decimal, but somewhat simplified.
I definitely agree with Oscar Benjamin on his point that decimal.Decimal from the stdlib is too complex (especially with Contexts) to be made accessible through a literal constructor, but I would even go a step further than him and say that if such a new number type gets introduced it should have additional properties that decimal.Decimal doesn't have.
One of the concerns about a new type seems to be that it makes Python's number system even more complex rather than simplifying it, but my suggestion would be to make the new literal create a compound number type behaving like my PyNumber class.
Please regard my use of two integers as an implementation detail here. My preference for a built-in type would actually be an integer for the integral part but a decimal for the fractional part; it was just a bit harder to code up quickly. I envision this decimal part to be essentially the simplified decimal that Oscar Benjamin advocates for, so it would have most of the power of decimal.Decimal, but it would only be concerned with the fractional part of numbers.
The user would never directly encounter this new decimal type, but would instead work with the compound "PyNumber" type. I agree, this still introduces a new type, but, importantly, this new type could be considered a superset of int. In fact, every int could be thought of as a "PyNumber" with a fractional part of 0 (just like real numbers can be thought of as complex numbers with an imaginary part of 0j). In the long run, it would thus be possible to make, e.g., division of integers return a "PyNumber" and to replace float in conversions from an integral number to a fractional one. Then there would be no need to expose the internals of number representation through things like:
>>> float(math.factorial(500))
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    float(math.factorial(500))
OverflowError: long int too large to convert to float
and precision loss like this would no longer occur:
>>> a = 26313083693369353016721801216
>>> a + .5
2.631308369336935e+28
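For the specific examples above, there is already a stdlib type with this "integral part exact, fraction exact" behavior: fractions.Fraction. It can serve as a reference point for what the proposed hybrid would return (a sketch for comparison, not part of the proposal):

```python
from fractions import Fraction
import math

# Both cases that defeat float (and default-context Decimal) stay exact
# when the integral part is carried as an unbounded integer:
a = Fraction(26313083693369353016721801216)
print(a + Fraction(1, 2))          # nothing rounded away

b = Fraction(10**50) - Fraction('9999999999999999.5')
print(b)                           # all integral digits preserved

# and conversion from a huge int is lossless, unlike float():
c = Fraction(math.factorial(500))  # no OverflowError
```

Fraction is of course a different design (unlimited precision everywhere, so results can grow without bound), whereas the proposal truncates the fractional part to a fixed number of digits.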
So, essentially, I would prefer a potential new number type to improve the user experience with Python numbers in more ways than currently discussed with the 'd' literal. And to come full circle:
Having a separate representation for the type's integral component could be advantageous when this is expected to be combined with ints a lot.
Cheers, Wolfgang