[Python-ideas] numerical type combining integer and float/decimal properties [Was: Re: Python Numbers as Human Concept Decimal System]

Tue Mar 11 07:36:38 CET 2014

On 03/10/2014 12:48 PM, Wolfgang Maier wrote:
> Going back to Mark Harris' initial proposal of unifying numeric types (which
> I am not trying to support in any way here), such a type would even allow to
> unify int and float since an ints could be considered a subset of the new
> type with a fractional part of zero.

This turned out a bit longer than I intended.  I think it is the right
approach to what I take to be the intent of the number unification
suggestions.  Trim it down a bit, add some examples, and it's a good
start to a PEP.   ;-)

I think adding a new number type isn't going to help that much, although it 
won't hurt either.

(Hoping this is a more practical direction for these discussions.)

General Proposal:
=================

Add a BaseValue class for creating context dependent values like a 
CurrencyValue class, or StatisticValue class.

The Value class should not specify any particular number type.  That is an 
implementation detail of a subclass.

The Value class will define how to round, compare, and track error ranges 
when used in calculations.  And will use ValueContext objects to adjust 
that as needed.

Add a ValueContext class to create context parameter objects.  These 
objects will be used to set a Value class's context for rounding, number of 
significant digits, and tracking error ranges.

The ValueContext objects should be very simple, and loosely defined in the
same way slice objects are.  They are only used for passing the context
parameters.  (Like slice objects pass indexes, but probably by calling
methods rather than by syntax.)
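
A rough sketch of what such a context object might look like, next to a
slice for comparison.  The attribute names (rounding, digits, error) are my
own assumptions for illustration; nothing above fixes them.

```python
from decimal import ROUND_HALF_EVEN


class ValueContext:
    """Plain parameter holder; the attribute names are assumptions."""

    def __init__(self, rounding=None, digits=None, error=None):
        self.rounding = rounding   # e.g. the name of a decimal rounding mode
        self.digits = digits       # number of significant digits
        self.error = error         # +- error range

    def __repr__(self):
        return "ValueContext(rounding=%r, digits=%r, error=%r)" % (
            self.rounding, self.digits, self.error)


# A slice only carries start/stop/step; the sequence decides what they mean.
s = slice(1, 10, 2)
print(s.start, s.stop, s.step)

# Likewise the context only carries parameters; a Value class would
# decide how to apply them.
ctx = ValueContext(rounding=ROUND_HALF_EVEN, digits=4, error=1)
print(ctx)
```

The object has no behaviour of its own, which is the point: all of the
interpretation would live in the Value class that receives it.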

Introduction:
=============

There have been a number of suggestions to unify numbers so that less
mathematically inclined users can do more complex calculations without
having to know quite as much about the underlying type of number.  There
is no magic wand that will make it completely painless, but we can make
things easier to get right for an average programmer with average to good
math skills.  (Probably the normal group for most beginning programming
students.)

The basic issue I see isn't with the number types, but with how we use 
them.  (And how we can use them better.)  There are a lot of parts that need
to work both independently and together at the same time.

Currently it's up to the programmer to keep track of significant digits
(and significant error), any rounding that may need to be done, and to do 
it all correctly.  There are many places where it can go wrong, and where 
getting it right requires a good understanding of these concepts within the 
speciality they are being used in.

The Decimal module tries to reduce that work, and also reduce errors, and 
increase overall accuracy, all at the same time.   But that doesn't help 
the existing number types.  Also while just using the decimal type isn't 
that difficult, using the other features it has correctly isn't as easy.

It's not easy to mix and match these ideas and come up with a usable, and 
reusable, interface for doing many similar types of calculations correctly 
as each value in a calculation (or function call), may not need the same 
exact requirements.

Possibly the broadest, but still practical, viewpoint gives three kinds of
numbers.  A number kind, as described here, refers to how a number is used,
and is independent of the way it's stored in the computer.  The number kind
is what determines some of the requirements of a calculation or comparison.

If we consider how a number is used to be independent (as much as is
reasonable) of how the number is stored in memory, then maybe the type is
less important than the context.

Kinds of Values:
================

1. Measurements: Values which have limited accuracy and need both proper 
rounding and tracking of error ranges.

2. Exact numbers:  Values that really do represent exactly what they are.
Usually smaller sized numbers, and usually not fractional amounts.
(Includes counters and indexes, for example.)

3. Ideal numbers: Values that are not measurements but can be calculated as 
accurately as needed and generally don't require keeping track of error 
because the significant digits can far exceed any measurements.
    - (Unless you are a theoretical physicist.)
    - (A low quality approximation should be considered a measurement,
       even if it could be calculated accurately to any nth digit.)

Ints and floating point actually do a very nice job of representing 2. and
3. in most cases.  They generally don't need to have a context associated
with them, and usually don't need rounding.  For example, we don't want to
round indexes and counters; they are in the exact group.  And we can
calculate Pi to whatever precision we need, so that any error range in the
value is less than significant.

Equality Tests:
===============

Another issue is with equality and how a value's exactness affects that.
For example, if we have two values, each with an error of +- 1, we can say
each has a width of 3.  (Or an inexactness of 3.)

So we might have this case where two values overlap when you consider the
width of each value.  (Its +- amount.)

      (23 +- 1) < (24 +- 1)  -->  False

Of course those would be objects instead of literals.  (Whether or not they
are decimal, int, or float isn't important.)
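
As a minimal sketch of how the comparison above could behave, assuming
"less than" should only be true when the two ranges don't overlap at all.
The Measurement name and its number/error attributes are hypothetical.

```python
class Measurement:
    """Hypothetical value object: a number plus a +- error amount."""

    def __init__(self, number, error=0):
        self.number = number
        self.error = error

    def __lt__(self, other):
        if not isinstance(other, Measurement):
            return NotImplemented
        # Only "less than" if our highest possible value is still below
        # the other's lowest possible value.
        return self.number + self.error < other.number - other.error

    def __gt__(self, other):
        if not isinstance(other, Measurement):
            return NotImplemented
        # Mirror image of __lt__: our lowest is above the other's highest.
        return self.number - self.error > other.number + other.error

    def __repr__(self):
        return "Measurement(%r, error=%r)" % (self.number, self.error)


overlapping = Measurement(23, 1) < Measurement(24, 1)   # ranges overlap
separated = Measurement(20, 1) < Measurement(24, 1)     # ranges don't
print(overlapping, separated)
```

Whether overlapping values should compare as False, or raise, or do
something else entirely, is exactly the kind of decision a context could
carry.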

This can be extended to all the other comparison operators.  The usual way 
of dealing with it is to do an error range test, which adds a lot of noise 
to an otherwise simple operation.  Especially if you need to put those in 
many different places.  They aren't quite complex enough to justify 
functions, but complex enough to be bothersome and can be a common source 
of errors.
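
To illustrate the noise, here is roughly what the hand-written version
looks like when the number and its error are kept in separate variables.
The variable names are made up for the example.

```python
# Hypothetical measurements, each kept by hand as a number and an error.
length_a, err_a = 23, 1
length_b, err_b = 24, 1

# "Definitely less than" has to be spelled out as a range test ...
a_below_b = (length_a + err_a) < (length_b - err_b)

# ... and "could be the same value" as a different range test, and both
# have to be repeated (correctly) everywhere the values are compared.
a_overlaps_b = abs(length_a - length_b) <= (err_a + err_b)

print(a_below_b, a_overlaps_b)
```

Each test is only one line, but getting a sign or a <= wrong in one of
many copies is easy to do and hard to spot.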

Exactness (and rounding, as it's directly related to exactness) is a
concept independent of how the number is stored in a computer.  (Unless
you use the wrong type of course, but then I would consider it a bug.)

Values vs Numbers:
==================

One approach is to have a Value class, and a ValueContext object.  The 
ValueContext object may be more like a slice object. It would be up to the 
Value class to interpret it so that it can know how to do the operator 
methods correctly...  __round__, __add__, __eq__, etc...
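
A very rough sketch of a Value class reading its context inside the
operator methods.  Several things here are assumptions on my part: a
SimpleNamespace stands in for the ValueContext object, "digits" is treated
as decimal places for round(), errors simply add under addition, and two
values are equal when their error ranges overlap.

```python
from types import SimpleNamespace


class Value:
    """Hypothetical base class; it does not care what number type it holds."""

    def __init__(self, number, context):
        self.number = number
        self.context = context   # any object with .digits and .error

    def __round__(self, ndigits=None):
        # Fall back to the context when no explicit digits are given.
        if ndigits is None:
            ndigits = self.context.digits
        return type(self)(round(self.number, ndigits), self.context)

    def __add__(self, other):
        if not isinstance(other, Value):
            return NotImplemented
        # Assumption: the result keeps the coarser digits setting and
        # the error ranges add up.
        context = SimpleNamespace(
            digits=min(self.context.digits, other.context.digits),
            error=self.context.error + other.context.error)
        return type(self)(self.number + other.number, context)

    def __eq__(self, other):
        if not isinstance(other, Value):
            return NotImplemented
        # Assumption: equal means the two error ranges overlap.
        limit = self.context.error + other.context.error
        return abs(self.number - other.number) <= limit

    __hash__ = None   # range-based equality doesn't fit hashing

    def __repr__(self):
        return "%s(%r +- %r)" % (
            type(self).__name__, self.number, self.context.error)


ctx = SimpleNamespace(digits=1, error=0.5)
a = Value(10.26, ctx)
b = Value(10.6, ctx)
print(round(a), a + b, a == b)
```

The number inside could just as well be an int, a Decimal or a Fraction;
only the operator methods need to know about the context.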

A base Value class could be used to define types such as a Measurement
class, or a Currency class (and others).

The context API needs to be on a separate Value class/type rather than on
the number, in my opinion.  I think trying to make a number type also be a
value is what gets us mixed up.  The number is just a scalar for the value.
It might also be called a vector, but that is a technical term that can be
confused with other concepts.

Benefits of a Value class:
==========================

* Doesn't change any existing number type.

* Value objects work with any number types.

* Value objects can have a repr that is much nicer, and more meaningful 
than the underlying number type.

* Context subclasses define how they are used (or act) rather than by what
kind of computer number they may contain.

* Context (slice like objects) that can be shared, and/or passed between 
Value objects as needed to keep track of rounding, significant digits, and 
error range.

* Can be a library module.

If Value objects are something that can be included in Python, then a
simpler decimal64 type without all the context settings could be used, as
it's just another computer number type with a bit more accuracy if you need it.

A Currency Value type could use Decimal, and the type can be swapped out
later for the decimal64, as that becomes just an implementation detail for
the Currency Value class.
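
Something like the following is what I have in mind, as a sketch only.
The CurrencyValue name, the two-decimal-place quantum, and the choice of
ROUND_HALF_EVEN are all assumptions for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN


class CurrencyValue:
    """Hypothetical currency value; Decimal is an internal detail."""

    # The stored number type is private to the class, so it could be
    # replaced later without changing how callers use CurrencyValue.
    _number_type = Decimal
    _quantum = Decimal("0.01")   # assumption: two decimal places

    def __init__(self, amount):
        # Going through str() avoids picking up binary float noise.
        number = self._number_type(str(amount))
        self._number = number.quantize(self._quantum,
                                       rounding=ROUND_HALF_EVEN)

    def __add__(self, other):
        if not isinstance(other, CurrencyValue):
            return NotImplemented
        return CurrencyValue(self._number + other._number)

    def __eq__(self, other):
        if not isinstance(other, CurrencyValue):
            return NotImplemented
        return self._number == other._number

    __hash__ = None   # kept simple: no hashing in this sketch

    def __repr__(self):
        return "CurrencyValue('%s')" % self._number


total = CurrencyValue(0.1) + CurrencyValue(0.2)
print(total, total == CurrencyValue("0.30"))
```

Callers only ever see CurrencyValue objects, so nothing outside the class
depends on Decimal being the stored type.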

Users could do something like that now by defining classes, and probably
have done it within their applications.  I consider that a supporting
argument for this approach, rather than an argument against it.  It still
takes a fair amount of math knowledge to get it right.

Making context dependent calculations easier to do, and use, would be very 
nice, and modules using these Value objects could possibly work together 
better.

A good test might be how "Value" objects could be used in the statistics 
module.  Or maybe Mark Harris could try out this idea in his new module.

I think this is the sort of thing Mark was initially looking for.  It's
really not that complicated if the dependencies are correctly ordered.
(Probably he's got something close to this in it already, just with a
different name.  Just a guess, but any solution that solves some of this
may not be that far off.)

BTW, The new literals would be helpful with this too. +1

Cheers,
    Ron