[Python-ideas] real numbers with SI scale factors

Chris Angelico rosuav at gmail.com
Sun Aug 28 22:33:16 EDT 2016


On Mon, Aug 29, 2016 at 11:44 AM, Ken Kundert
<python-ideas at shalmirane.com> wrote:
> When working with a general purpose programming language, the above numbers
> become:
>
>     780kpc -> 7.8e+05
>     108MPa -> 1.08e+08
>     600TW  -> 6e+14
>     3.2Gb  -> 3.2e+09
>     53pm   -> 5.3e-11
>     $8G    -> 8e+09
>
> Notice that the numbers become longer, harder to read, harder to type, harder to
> say, and harder to hear.
>

And easier to compare. The SI prefixes are almost consistent in using
uppercase for larger units and lowercase for smaller, but not quite
(kilo is a lowercase k), and there's no particular pattern in which
letter is larger. For someone who isn't extremely familiar with them,
that makes them effectively unordered - which is larger, peta or exa?
Which is smaller, nano or pico? Plus, there's a limit to how far you
can go with these kinds of numbers, currently yotta at e+24.
Exponential notation has no such ceiling: it runs to about 1.8e308 in
IEEE 64-bit binary floats, and plenty further in decimal.Decimal - I
believe its limit is about 1e+(1e6) in the default context, and REXX
on OS/2 had a limit of 1e+(1e10) for its arithmetic - while remaining
equally readable at all scales.
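
A rough sketch of that comparison, for anyone who wants to poke at it.
The SI_EXPONENTS table is just an illustrative name for the 2016 prefix
list, and the Decimal figure assumes the default context has not been
changed:

```python
import sys
from decimal import Decimal, getcontext

# SI prefixes as of 2016, mapped to their powers of ten
# (illustrative table, not part of any library).
SI_EXPONENTS = {
    "Y": 24, "Z": 21, "E": 18, "P": 15, "T": 12, "G": 9, "M": 6, "k": 3,
    "m": -3, "μ": -6, "n": -9, "p": -12, "f": -15, "a": -18, "z": -21, "y": -24,
}

# Sorting the letters themselves tells you nothing about magnitude...
by_letter = sorted(SI_EXPONENTS)
# ...whereas the exponents order themselves.
by_size = sorted(SI_EXPONENTS, key=SI_EXPONENTS.get)

print("by letter:", " ".join(by_letter))
print("by size:  ", " ".join(by_size))

# Exponential notation keeps going well past yotta.
print("largest finite float:", sys.float_info.max)
print("default Decimal Emax:", getcontext().Emax)
huge = Decimal("1e999999")  # far beyond any SI prefix, still one short literal
print("a large Decimal:", huge)
```

The letter-sorted and size-sorted lists come out in different orders,
which is the "unordered" problem in a nutshell.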

So we can't get rid of exponential notation, no matter what happens.
Mathematics cannot usefully handle a system in which we have to
represent large exponents with ridiculous compound scale factors:

sys.float_info.max = 179.76931348623157*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*Y*E

(It's even worse if the Exa collision means you stop at Peta.
179.76931348623157*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*P*M, anyone?)

Since exponential notation has to stay, these tags can only ever be a
duplicate way of representing a specific set of exponents.
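
The compound expressions above can be checked with a short sketch. It
uses exact integers and decimal.Decimal rather than floats, so that no
rounding creeps into the repeated multiplication; the single-letter
names are only stand-ins for the SI prefixes:

```python
import sys
from decimal import Decimal

# Exact integer scale factors (stand-in names mirroring the SI letters).
Y, E, P, M = 10**24, 10**18, 10**15, 10**6

mantissa = Decimal("179.76931348623157")
with_yotta = mantissa * Y**12 * E   # twelve yottas and one exa
with_peta = mantissa * P**20 * M    # twenty petas and one mega

# Both spellings are the same number, and it converts to the largest float.
print(with_yotta == with_peta)
print(float(with_yotta) == sys.float_info.max)
```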

> Before we expend any more effort on this topic, let's put aside the question of
> how it should be done, or how it will be used after it's done, and just focus on
> whether we do it at all. Should Python support real numbers specified with SI
> scale factors as first class citizens?

Except that those are exactly the important questions to be answered.
How *could* it be done? With the units stripped off, your examples
become:

    780k == 7.8e+05 == 780*k
    108M == 1.08e+08 == 108*M
    600T == 6e+14 == 600*T
    3.2G == 3.2e+09 == 3.2*G
    53p == 5.3e-11 == 53*p
    8G == 8e+09 == 8*G

Without any support whatsoever, you can already use the third column
notation, simply by creating this module:

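# si.py
# SI scale factors as plain floats, for use as multipliers (e.g. 780*k).
k, M, G, T, P, E, Z, Y = 1e3, 1e6, 1e9, 1e12, 1e15, 1e18, 1e21, 1e24
m, μ, n, p, f, a, z, y = 1e-3, 1e-6, 1e-9, 1e-12, 1e-15, 1e-18, 1e-21, 1e-24
u = μ  # ASCII alias for micro
K = k  # uppercase alias for kilo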

And using it as "from si import *" at the top of your code. Do we see
a lot of code in the wild doing this? "[H]ow it will be used after
it's done" is exactly the question that this would answer.
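
As a quick sketch of what that looks like in use, here are the six
examples from the table above. The definitions are repeated inline so
the snippet runs on its own, and the comparison uses math.isclose on
the assumption that a product such as 53*p may differ from the literal
5.3e-11 in the last bit:

```python
import math

# Same values as in si.py, inlined so this runs without the module.
k, M, G, T = 1e3, 1e6, 1e9, 1e12
p = 1e-12

# (scaled spelling, exponential spelling) pairs from the table above.
examples = [
    (780 * k, 7.8e+05),
    (108 * M, 1.08e+08),
    (600 * T, 6e+14),
    (3.2 * G, 3.2e+09),
    (53 * p, 5.3e-11),
    (8 * G, 8e+09),
]

for scaled, exponential in examples:
    # The scale factors are ordinary floats, so compare with a tolerance.
    print(scaled, exponential, math.isclose(scaled, exponential))
```

The obvious cost is that "from si import *" claims a lot of
single-letter names, which is one more thing real-world usage would
tell us about.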

> Don't Python's users in the scientific and engineering communities deserve
> the same treatment?  These are, after all, core communities for Python.

Yes. That's why we have things like the @ matrix multiplication
operator (because the numeric computational community asked for it),
and %-formatting for bytes strings (because the networking, mainly
HTTP serving, community asked for it). Python *does* have a history of
supporting things that are needed by specific sub-communities of
Python coders. But there first needs to be a demonstrable need. How
much are people currently struggling due to the need to transform
"gigapascal" into "e+9"? Can you show convoluted real-world code that
would be made dramatically cleaner by language support?

ChrisA

