Re: [Python-ideas] SI scale factors alone, without units or dimensional analysis

26 Aug 2016

On Fri, Aug 26, 2016 at 10:47 PM, Steven D'Aprano wrote:
...
(1) Are the results floats, ints, or something else?
I would expect that 1K would be int 1000, not float 1000. But what about
fractional prefixes, like 1m? Should that be a float or a decimal?
If I write 7981m I would expect 7.981, not 7.9809999999999999, so maybe
I want a decimal float, not a binary float?
Introduce "d" as a prefix meaning 1, and this could be the way of
creating something that people have periodically asked for: Decimal
literals.

(Though IIRC there were some complexities involving Decimal literals
and decimal.getcontext(), which would have to be resolved before 1m
could represent a Decimal.)
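
To make the binary-versus-decimal point concrete with what exists today, here is a minimal sketch. The `m` suffix is not real syntax, so the scaling by 10**-3 is spelled out by hand, and the variable names are mine.

```python
from decimal import Decimal

# Hypothetical meaning of 7981m, written out explicitly: 7981 * 10**-3.
as_decimal = Decimal(7981).scaleb(-3)   # shifts the decimal exponent, no binary rounding
as_float = 7981 / 1000                  # the nearest binary float to 7.981

print(as_decimal)            # the decimal value
print(Decimal(as_float))     # the binary float's exact stored value, for comparison
```

The Decimal result compares equal to Decimal("7.981"); the float's stored value does not, which is the discrepancy being described.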
...
Actually, what I would really want is for the scale factor to be tracked
separately. If I write 7981m * 1M, I should end up with 7981000 as an
int, not a float. Am I being unreasonable?
Easy. Make them Fraction literals instead. You'll end up with
7981000/1 as a rational, rather than a pure int, but if you want
accurate handling of SI prefixes, rationals will serve you fairly
well.
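
A quick sketch of the Fraction idea using the existing fractions module. The names `milli` and `mega` stand in for the proposed suffixes and are only illustrative.

```python
from fractions import Fraction

# Stand-ins for the hypothetical literals 1m and 1M.
milli = Fraction(1, 1000)
mega = Fraction(10**6)

# 7981m * 1M, with the scale factors kept exact.
result = 7981 * milli * mega

print(result, type(result).__name__)
print(result.denominator)   # a denominator of 1 means the value is a whole number
```

The result is a Fraction rather than an int, but it compares equal to the int 7981000 and no precision is lost along the way.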
...
Obviously if I write 1.1K then I'm expecting a float. So I'm not
*entirely* unreasonable :-)
Obviously :)
...
(2) Decimal or binary scale factors?
The SI units are all decimal, and I think if we support these, we should
insist that K == 1000, not 1024. For binary scale factors, there is the
IEC standard:
http://physics.nist.gov/cuu/Units/binary.html
which defines Ki = 2**10, Mi = 2**20, etc. (Fortunately this doesn't
have to deal with fractional prefixes.) So it would be easy enough to
support them as well.
from __future__ import binary_scale_factors as scale_factors
from __future__ import decimal_scale_factors as scale_factors
# tongue only partly in cheek
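
Less tongue in cheek, the two families can coexist because the IEC names are distinct from the SI ones. A rough sketch, where the tables and the `scale` helper are my own illustration and not part of any proposal:

```python
# Decimal (SI) and binary (IEC) prefixes kept in separate tables.
SI_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
IEC_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

def scale(value, prefix):
    """Multiply value by an SI or IEC prefix (hypothetical helper)."""
    # IEC prefixes are the ones ending in "i"; everything else is treated as SI.
    table = IEC_PREFIXES if prefix.endswith("i") else SI_PREFIXES
    return value * table[prefix]

print(scale(1, "k"), scale(1, "Ki"))
```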
...
(3) µ or u, k or K?
I'm going to go to the barricades to fight for the real SI prefixes µ
and k to be supported. If people want to support the common fakes u and
K as well, that's fine, I have no objection, but I think that it's
important to support the actual prefixes too.
I would strongly support the use of µ and weakly u. With k vs K, no
opinion. If both can be supported without being confusing, grab 'em
both. With output formats, it's less clear, but I would still be
inclined toward µ for output.
...
(4) What about E?
E is tricky if we want 1E to be read as the integer 10**18, because it
matches the floating point syntax 1E (which is currently a syntax
error). So there's a nasty bit of ambiguity where it may be unclear
whether or not 1E is intended as an int or an incomplete float, and then
there's 1E1E which might be read as 1E1*10**18 or as just an error.
It's worse than that. Currently, 1E+2 is a perfectly legal 100.0
(float), but under this proposal, it would be a constant expression
yielding 1_000_000_000_000_000_002, so it wouldn't just be giving
meaning to things that are currently errors.
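
Spelled out in current Python, with the proposed reading written by hand (assuming E = 10**18, as above):

```python
# Today: 1E+2 is a float literal with exponent +2.
current = 1E+2

# Hypothetical reading under the proposal: the literal 1E (1 * 10**18), plus 2.
proposed = 1 * 10**18 + 2

print(type(current).__name__, current)
print(type(proposed).__name__, proposed)
```

Same source text, two different types and two very different values.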
...
Replacing E with (say) X is risky. The two largest current SI prefixes
are Z and Y, and it seems very likely that the next one added (if that ever
happens) will be X. Actually, using any other letter risks clashing with
a future expansion of the SI prefixes.
Anything's risky. Probably the least risky option is to simply stop
before Exa and implement the feature without.
...
(7) What about repr() and str()?
I don't think that the repr() or str() of numeric types should change.
But perhaps format() could grow some new codes to display numbers using
either the most obvious scale factor, or some specific scale factor.
Agreed. And I'd have them simply pick the one most obvious - if you
want a specific factor, you can simply invert and display.
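
Roughly what I have in mind, as a sketch only. `format_si` is a hypothetical helper, not an existing format() code, and the prefix table is deliberately short.

```python
# Hypothetical helper: pick the largest prefix that does not exceed the value.
_PREFIXES = [
    (10**12, "T"), (10**9, "G"), (10**6, "M"), (10**3, "k"),
    (1, ""), (10**-3, "m"), (10**-6, "µ"), (10**-9, "n"),
]

def format_si(value):
    """Format a number using the most obvious SI scale factor."""
    if value == 0:
        return "0"
    magnitude = abs(value)
    for factor, prefix in _PREFIXES:
        if magnitude >= factor:
            return f"{value / factor:g}{prefix}"
    # Smaller than the smallest prefix in the table: fall back to that prefix.
    factor, prefix = _PREFIXES[-1]
    return f"{value / factor:g}{prefix}"

print(format_si(7981000), format_si(1500), format_si(0.0047))

# A specific factor: invert by hand and display.
print(f"{7981000 / 10**3:g}k")
```

This ignores rounding at the boundaries between prefixes, which a real format code would have to define.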
...
This leads to my first proposal: require an explicit numeric prefix on
numbers before scale factors are allowed, similar to how we treat
non-decimal bases.
8M  # remains a syntax error
0s8M  # unambiguously an int with a scale factor of M = 10**6
0s1E1E  # a float 1E1 with a scale factor of E = 10**18
0s1.E  # a float 1. with a scale factor of E, not an exponent
int('8M')  # remains a ValueError
int('0s8M', base=0)  # returns 8*10**6
Hmm, interesting. Feels clunky but could work.
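
For the string-parsing half, something like the following could emulate the 0s rules today. This is a sketch under my own assumptions: `parse_scaled`, the regex, and the choice of suffixes are illustrative, and only upward scale factors are covered.

```python
import re

SCALE = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}

# "0s", then a number (optional fraction, optional exponent), then one scale letter.
# The exponent needs digits after the E, so a trailing bare E is the scale factor.
_PATTERN = re.compile(r"0s(\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)([kMGTPE])")

def parse_scaled(text):
    """Parse a hypothetical 0s-prefixed scaled literal from a string."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid scaled literal: {text!r}")
    number, suffix = match.groups()
    is_float = "." in number or "e" in number.lower()
    value = float(number) if is_float else int(number)
    return value * SCALE[suffix]

print(parse_scaled("0s8M"), parse_scaled("0s1E1E"), parse_scaled("0s1.E"))
```

Without the 0s prefix the pattern does not match, so '8M' stays a ValueError.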
...
Or if that's too heavy (two whole characters, plus the suffix!) perhaps
we could have a rule that the suffix must follow the final underscore
of the number:
8_M  # int 8*10**6
123_456_789_M  # int 123456789*10**6
123_M_456  # still an error
8._M  # float 8.0*10**6
This sounds better IMO. It's not legal syntax in any version of Python
older than 3.6, so there's minimal backward compatibility trouble.
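
The final-underscore rule is also easy to emulate on strings, leaning on the underscore support that int() and float() gained in 3.6. Again a sketch: `parse_underscore_scaled` and the suffix table are hypothetical.

```python
SCALE = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}

def parse_underscore_scaled(text):
    """Parse a number whose scale suffix follows the final underscore."""
    # Split at the last underscore: everything before it is the number.
    number, sep, suffix = text.rpartition("_")
    if not sep or suffix not in SCALE:
        raise ValueError(f"invalid scaled literal: {text!r}")
    # int() and float() accept digit-group underscores on Python 3.6+.
    value = float(number) if "." in number else int(number)
    return value * SCALE[suffix]

print(parse_underscore_scaled("8_M"))
print(parse_underscore_scaled("123_456_789_M"))
print(parse_underscore_scaled("8._M"))
```

A string like '123_M_456' is rejected because what follows the final underscore is not a scale factor.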

ChrisA

Chris Angelico